Problem Statement:¶

In the food industry, identifying food items quickly and accurately is essential for applications such as automated inventory management, calorie estimation, restaurant automation, and dietary monitoring. Manual identification is time-consuming, error-prone, and not scalable. Thus, there is a need for an automated, intelligent system that can classify food items from images with high accuracy.

Context:¶

In the era of digital transformation, automated food detection using computer vision has become increasingly important in various sectors such as hospitality, healthcare, fitness, retail, and food delivery. Accurate identification of food items from images enables intelligent systems to recognize what a person is eating, streamline restaurant operations, or even automate checkout processes in cafeterias.

For example, in a smart cafeteria, cameras can detect and identify food items on a tray without manual input, enabling a frictionless billing experience. In diet and nutrition apps, users can take a picture of their meal, and the app can instantly classify the food and estimate nutritional content. In quality assurance for food production, automated systems can detect if the right type of food is being processed or if items are visually defective.

Such applications demand a robust food classification model capable of identifying food items from images with high accuracy, regardless of variations in presentation, lighting, or camera angles. This project aims to tackle this challenge by leveraging deep learning techniques to train a model that can automatically detect and classify different types of food from a diverse dataset of labeled food images.

Data Descriptions:¶

The project uses a curated subset of the Food-101 dataset, a widely used benchmark for food classification tasks. This dataset includes:

500 images categorized into

10 distinct food classes (e.g., apple_pie, fried_rice, sushi)

Each class contains a balanced distribution of training and test images, generally split in a 70-30 ratio

Images vary in lighting, background, and angle to mimic real-world food photography conditions

Each image is labeled with the corresponding food class, enabling supervised learning approaches to be applied effectively.

Project Objective¶

The primary goal of this project is to:

Develop a deep learning-based food identification model that can accurately classify food items from images.

Key objectives include:

Building a convolutional neural network (CNN) model to classify food images into one of the 10 defined categories

Evaluating model performance using standard metrics such as accuracy, precision, recall and confusion matrix.

Enabling a potential real-time application where the trained model can be integrated into camera-based systems for smart kitchens, restaurant automation, or diet-tracking apps

Ultimately, this solution aims to demonstrate the feasibility of intelligent, camera-driven food recognition systems, contributing toward innovations in food technology and AI-driven lifestyle tools.

Step 1: Import the data¶

Importing Required Libraries¶

In [5]:
import os  # File and directory operations
import pandas as pd  # Data handling
import matplotlib.pyplot as plt  # Plotting
import matplotlib.patches as patches  # Drawing shapes on plots
import cv2  # Image processing
import numpy as np

Unzipping the Food-101 Dataset¶

In [6]:
# Define the path to the ZIP file containing the dataset
zip_path = 'Food_101.zip'

# Define the directory where the ZIP file should be extracted
extract_to = 'food101_data'
In [34]:
import zipfile  # Importing the zipfile module to handle ZIP archives
# Open the ZIP file in read mode ('r') using a context manager
with zipfile.ZipFile(zip_path, 'r') as zip_ref:
    # Extract all contents of the ZIP file to the specified directory
    zip_ref.extractall(extract_to)

# Print confirmation message after extraction is complete
print("Dataset unzipped!")
Dataset unzipped!

Exploratory Data Analysis¶

Verify Directory Structure¶

In [7]:
# List all files and directories in the specified path 'extract_to'
# 'extract_to' should be a variable that holds the path where your dataset was extracted
os.listdir(extract_to)
Out[7]:
['.DS_Store', '__MACOSX', 'Food_101']

List classes¶

In [9]:
# Join the extraction directory with the 'Food_101' folder to get the full path
food101_dir = os.path.join(extract_to, 'Food_101')

# List all files and subdirectories in the 'Food_101' folder
# This will typically include folders like 'images' and files like 'meta'
os.listdir(food101_dir)
Out[9]:
['ice_cream',
 'samosa',
 'donuts',
 '.DS_Store',
 'waffles',
 'falafel',
 'ravioli',
 'strawberry_shortcake',
 'spring_rolls',
 'hot_dog',
 'apple_pie',
 'chocolate_cake',
 'tacos',
 'pancakes',
 'pizza',
 'nachos',
 'french_fries',
 'onion_rings']
In [10]:
base_path = 'food101_data/Food_101/'  # path to class folders
class_to_images = {}

for cls_name in os.listdir(base_path):
    cls_folder = os.path.join(base_path, cls_name)
    if os.path.isdir(cls_folder):
        image_files = os.listdir(cls_folder)
        class_to_images[cls_name] = image_files
In [11]:
# Summary
total_images = sum(len(v) for v in class_to_images.values())
print(f"Total classes: {len(class_to_images)}")
print(f"Total images: {total_images}")
Total classes: 17
Total images: 16257
In [12]:
for i, (cls, imgs) in enumerate(class_to_images.items()):
    print(f"{cls}: {len(imgs)} images")
ice_cream: 1000 images
samosa: 1000 images
donuts: 1000 images
waffles: 1000 images
falafel: 1000 images
ravioli: 1000 images
strawberry_shortcake: 1000 images
spring_rolls: 1000 images
hot_dog: 1000 images
apple_pie: 257 images
chocolate_cake: 1000 images
tacos: 1000 images
pancakes: 1000 images
pizza: 1000 images
nachos: 1000 images
french_fries: 1000 images
onion_rings: 1000 images

Observation :

  • Total Classes: There are 17 different food categories in your current dataset.
  • Total Images: There are a total of 16,257 food images available.
  • Uniformity: Most classes (like pizza, donuts, pancakes, etc.) have 1,000 images each, showing good class balance.
  • Exception: Only one class, apple_pie, has fewer images (257 only) — this may cause imbalance in training.
  • This dataset is suitable for multi-class image classification, and can also be extended to object detection if bounding boxes are added.

Class Distribution Plot¶

In [13]:
# 1. Class Distribution Plot
classes = list(class_to_images.keys())
counts = [len(imgs) for imgs in class_to_images.values()]

plt.figure(figsize=(12, 6))
plt.bar(classes, counts, color='skyblue')
plt.xticks(rotation=45, ha='right')
plt.xlabel('Food Classes')
plt.ylabel('Number of Images')
plt.title('Number of Images per Food Class')
plt.show()
No description has been provided for this image

Observation:

  • Most classes contain exactly 1,000 images, which is ideal for training.
  • Only one class (apple_pie) has significantly fewer images (257) — this may lead to class imbalance during training.
  • Dataset is well-suited for image classification tasks.

Image Size Analysis (width and height)¶

In [41]:
# Image Size Analysis (width and height)
import random
from PIL import Image
widths, heights = [], []

for cls, images in class_to_images.items():
    sample_images = random.sample(images, min(20, len(images)))  # sample 20 images per class
    for img_name in sample_images:
        img_path = os.path.join(base_path, cls, img_name)
        with Image.open(img_path) as img:
            w, h = img.size
            widths.append(w)
            heights.append(h)

plt.figure(figsize=(12, 5))
plt.subplot(1, 2, 1)
plt.hist(widths, bins=30, color='salmon', edgecolor='black')
plt.title('Distribution of Image Widths')
plt.xlabel('Width (pixels)')
plt.ylabel('Count')

plt.subplot(1, 2, 2)
plt.hist(heights, bins=30, color='lightgreen', edgecolor='black')
plt.title('Distribution of Image Heights')
plt.xlabel('Height (pixels)')
plt.ylabel('Count')

plt.tight_layout()
plt.show()
No description has been provided for this image

Image Size Distribution Observation

  • Most images are 512x512 pixels.

    • This indicates the dataset is already quite standardized.
  • A few images have smaller dimensions (e.g., 300, 350 pixels).

    • These are outliers and occur rarely.
  • This consistency is useful for model training.

    • We can resize all images to 512x512 or a smaller fixed size (like 224x224) for deep learning models.
  • No very large or very small images were found.

    • This ensures minimal image distortion during preprocessing.

Visualize the data, showing one image per class from 101 classes¶

In [15]:
# Visualize the data, showing one image per class from 101 classes
# Path to dataset
data_dir = food101_dir  # Assuming `food101_dir` is already defined
foods_sorted = sorted([
    d for d in os.listdir(data_dir)
    if os.path.isdir(os.path.join(data_dir, d))
])


# Total number of classes
num_classes = len(foods_sorted)

# Dynamically define grid size
cols = 6
rows = int(np.ceil(num_classes / cols))

# Create subplots
fig, ax = plt.subplots(rows, cols, figsize=(4 * cols, 4 * rows))
fig.suptitle("Showing one random image from each class", y=1.02, fontsize=24)

# Flatten axes for easier iteration (in case rows * cols > num_classes)
ax = ax.flatten()

for food_id, food_name in enumerate(foods_sorted):
    food_images = os.listdir(os.path.join(data_dir, food_name))
    random_img = np.random.choice(food_images)
    img_path = os.path.join(data_dir, food_name, random_img)
    img = plt.imread(img_path)

    ax[food_id].imshow(img)
    ax[food_id].set_title(food_name, pad=10)
    ax[food_id].axis('off')

# Hide any extra axes if there are unused subplots
for i in range(num_classes, len(ax)):
    ax[i].axis('off')

plt.tight_layout()
plt.subplots_adjust(top=0.93)  # Leave room for suptitle
plt.show()
No description has been provided for this image

Step 2: Map training and testing images to its classes.¶

In [16]:
from sklearn.model_selection import train_test_split
# Adjust path as needed
base_path = 'food101_data/Food_101'

# Get class names from folder names
class_names = sorted([folder for folder in os.listdir(base_path) if os.path.isdir(os.path.join(base_path, folder))])

food_data = []

# Collect image path and class label
for label in class_names:
    folder_path = os.path.join(base_path, label)
    for img_file in os.listdir(folder_path):
        if img_file.lower().endswith(('.jpg', '.jpeg', '.png')):
            img_path = os.path.join(folder_path, img_file)
            food_data.append((img_path, label))

# Create DataFrame
food_df = pd.DataFrame(food_data, columns=['image_path', 'label'])

# Split into train/test (80/20)
train_food_df, test_food_df = train_test_split(food_df, test_size=0.2, stratify=food_df['label'], random_state=42)

print("✅ Mapped images to classes.")
print(f"Train: {len(train_food_df)} images, Test: {len(test_food_df)} images")
train_food_df.head()
✅ Mapped images to classes.
Train: 13004 images, Test: 3252 images
Out[16]:
image_path label
2230 food101_data/Food_101/donuts/2249805.jpg donuts
12195 food101_data/Food_101/samosa/1145678.jpg samosa
13392 food101_data/Food_101/strawberry_shortcake/225... strawberry_shortcake
13828 food101_data/Food_101/strawberry_shortcake/354... strawberry_shortcake
10269 food101_data/Food_101/ravioli/788592.jpg ravioli
In [18]:
food_df
Out[18]:
image_path label
0 food101_data/Food_101/apple_pie/2968812.jpg apple_pie
1 food101_data/Food_101/apple_pie/3134347.jpg apple_pie
2 food101_data/Food_101/apple_pie/3314985.jpg apple_pie
3 food101_data/Food_101/apple_pie/3670548.jpg apple_pie
4 food101_data/Food_101/apple_pie/3917257.jpg apple_pie
... ... ...
16251 food101_data/Food_101/waffles/764669.jpg waffles
16252 food101_data/Food_101/waffles/113651.jpg waffles
16253 food101_data/Food_101/waffles/2364175.jpg waffles
16254 food101_data/Food_101/waffles/3844038.jpg waffles
16255 food101_data/Food_101/waffles/1576252.jpg waffles

16256 rows × 2 columns

Step 3: Create annotations for training and testing images.¶

[Take any 10 foods(class) of your choice and select any 50 images inside each food and create the annotations manually. You can use any image annotation tool to get the coordinates.]

Image Annotation Overview:

To train a model for object detection (such as YOLO, SSD, or Faster R-CNN), we’ve created annotations for selected food classes. These annotations are saved in a CSV file and follow a structured format suitable for model training.

Annotation Task Details

We selected 10 food classes of our choice. wihch is

  • French Fries
  • Apple Pie
  • Nachos
  • Pizza
  • Pancakes
  • Tacos
  • Chocolate Cake
  • Hot Dog
  • Onion Rings
  • Spring Roll

For each food class, we manually annotated 50-60 images.

We used an image annotation tool(Roboflow) to mark bounding boxes (object locations).

The annotation data is saved in a file: Datasetv1/original_images/_annotations.csv

Annotation File Structure:

The CSV file contains the following columns:

Column Description
filename Name of the image file (e.g., pizza_01.jpg)
width Width of the image in pixels
height Height of the image in pixels
class Name of the object class (e.g., pizza, samosa, etc.)
xmin X-coordinate of the top-left corner of the bounding box
ymin Y-coordinate of the top-left corner of the bounding box
xmax X-coordinate of the bottom-right corner of the bounding box
ymax Y-coordinate of the bottom-right corner of the bounding box

This format is commonly used in object detection datasets to describe the position and size of objects within each image.

File & Folder Paths:

Below are the paths used for image data and annotations:

  • Path to the annotation file
    Datasetv1/original_images/_annotations.csv

  • Folder containing the corresponding images
    Datasetv1/original_images/

Step 4: Display images with bounding box you have created manually in the previous step.¶

In [1206]:
# Path to the CSV file containing image annotations (e.g., bounding boxes, labels)
csv_path = 'Datasetv1/original_images/_annotations.csv'

# Path to the folder where the original images are stored
img_folder = 'Datasetv1/original_images/'
In [1207]:
# Load annotations
food_annotations_df = pd.read_csv(csv_path)
In [1208]:
# Display the shape of the DataFrame to check the number of rows and columns
food_annotations_df.shape
Out[1208]:
(558, 8)
In [1209]:
# Display the entire DataFrame to inspect the data including any new columns added
food_annotations_df
Out[1209]:
filename width height class xmin ymin xmax ymax
0 2909830_jpg.rf.bb9125215f38f22139f72d04f19e693... 512 512 Apple Pie 210 43 397 259
1 108743_jpg.rf.260978b4f8ae78f4ebb41f48ef501679... 512 384 French Fries 50 3 442 383
2 149278_jpg.rf.86187fd5bd1698133cb7a973c6060449... 512 384 French Fries 33 0 260 167
3 2986199_jpg.rf.ac0b99e71100520e6608ef72b12ee27... 512 512 Apple Pie 28 37 291 233
4 2934928_jpg.rf.c8f427a0d3e7ba9342fe37276fb15ab... 512 512 Apple Pie 9 54 463 465
... ... ... ... ... ... ... ... ...
553 30292-hotdog_jpg.rf.0390f5521fb9e6e7e3acb2a6a8... 640 640 Hotdog 56 0 640 605
554 14043-hotdog_jpg.rf.8336579be067ac62410422f411... 640 640 Hotdog 48 124 404 526
555 8006-hotdog_jpg.rf.2b5a43d73a7b80e624c778536e2... 640 640 Hotdog 65 45 623 640
556 4345-hotdog_jpg.rf.c81f7d5ae5388487ceea9df4709... 640 640 Hotdog 2 8 640 640
557 51643-hotdog_jpg.rf.2eeb177096d2f26e6f38322d53... 640 640 Hotdog 3 26 483 607

558 rows × 8 columns

In [1210]:
# Extract and display all unique food class names from the dataset
food_classes = set(food_annotations_df['class'])
print("List of unique food categories in the dataset:")
for food in sorted(food_classes):
    print("-", food)
List of unique food categories in the dataset:
- Apple Pie
- Chocolate
- French Fries
- Hotdog
- Nachos
- Pizza
- onion_rings
- pancakes
- spring_rolls
- tacos
In [1211]:
# Check for duplicate filenames in the dataset
duplicate_filenames = food_annotations_df[food_annotations_df.duplicated(subset='filename', keep=False)]

print(f"Total duplicate filenames found: {duplicate_filenames['filename'].nunique()}")
print("List of duplicated filenames:")
print(duplicate_filenames['filename'].value_counts())
Total duplicate filenames found: 32
List of duplicated filenames:
filename
189678-nachos_jpg.rf.f186725dbfe1bc23e9532408103e1060.jpg    5
3004621_jpg.rf.1a70aad430f7fcc72cc14f91446d4c08.jpg          4
7394_jpg.rf.1838448cb2b3d641b167b9cfbca600cc.jpg             4
91964_jpg.rf.0c917d27d8f80e5c630140d81031d231.jpg            3
11193_jpg.rf.afefd57ffc19ba1eeb51afeee3bf37b4.jpg            3
113781_jpg.rf.de10ec12748947f00d231f8c55aaefb8.jpg           3
1030289_jpg.rf.702c29c39daf844a889cc73917369bdd.jpg          3
2618003_jpg.rf.8d18399346288665532d0826566a79eb.jpg          3
2861144_jpg.rf.a9287e2d7af886a3c026273c3349edba.jpg          3
36081_jpg.rf.bcde8146b7446e659e5d17e94d563635.jpg            2
1058697_jpg.rf.187204c8e93dbe0d20f8676a3f9f7c33.jpg          2
110171_jpg.rf.2e6a197703f7096765d773f023bda859.jpg           2
38615_jpg.rf.edfc43b51bb448e7763ffc9c6c3237c3.jpg            2
45817_jpg.rf.b4f80dfda9bea5836fedec2c7b65e578.jpg            2
62663_jpg.rf.d6e00a3b034bc15f515a5fa056ca1733.jpg            2
58787_jpg.rf.a8acae7e04404aeb8ad1c6a5f8b65434.jpg            2
145012_jpg.rf.4544abe395055b02ccd3e1076038f4ff.jpg           2
33259_jpg.rf.56a5b0558bdb03c426e60f6b5f89b8f4.jpg            2
78171_jpg.rf.4712e20db14395cc19199a4f927ec652.jpg            2
36370_jpg.rf.fc4e83fc5c0a333ddd949da6ac871995.jpg            2
62484_jpg.rf.7a9effc3895e6123dcf647b7f92549f6.jpg            2
92235_jpg.rf.53c19df7b5c9ec2f9d0ffcad8470c394.jpg            2
35235_jpg.rf.32771ba6dfe7c36611eee12e9a4076b6.jpg            2
71645_jpg.rf.7c1651d6851e2f6b318c16b37516c9e6.jpg            2
2983047_jpg.rf.0581d006429c601c3b014a9e4abe4b5c.jpg          2
74527_jpg.rf.a53136bdf4e575d077f34c3c1a41b50a.jpg            2
110385_jpg.rf.ed897b8ba0e20976351d7e0777963d00.jpg           2
80540_jpg.rf.134fc69263831ead08ce2f8a43ac5644.jpg            2
1126_jpg.rf.d3ba4b55b4bf612e7af22ea7ff137788.jpg             2
68177_jpg.rf.4286d561950cc21283c4e2b372092ac1.jpg            2
101450_jpg.rf.eddcc68593aa541ba3d9cce8835094be.jpg           2
95572_jpg.rf.a47685e871481cef6935b90644ff7ba5.jpg            2
Name: count, dtype: int64
In [1212]:
# Remove duplicate rows based on filename, keeping the first occurrence
food_annotations_df = food_annotations_df.drop_duplicates(subset='filename', keep='first').reset_index(drop=True)

print(f"Duplicate rows removed. New shape of DataFrame: {food_annotations_df.shape}")
Duplicate rows removed. New shape of DataFrame: (513, 8)
In [1213]:
# Show the distribution of samples across different food classes
class_counts = food_annotations_df['class'].value_counts()

print("Food class distribution (class: count):")
for class_name, count in class_counts.items():
    print(f"- {class_name}: {count}")
Food class distribution (class: count):
- pancakes: 57
- spring_rolls: 53
- tacos: 52
- French Fries: 51
- onion_rings: 51
- Pizza: 50
- Nachos: 50
- Chocolate: 50
- Hotdog: 50
- Apple Pie: 49
In [1214]:
# Display summary statistics about the dataset
total_annotations = len(food_annotations_df)
unique_images = food_annotations_df['filename'].nunique()
unique_classes = food_annotations_df['class'].nunique()

print("Dataset Summary:")
print(f"- Total annotations       : {total_annotations}")
print(f"- Unique image files      : {unique_images}")
print(f"- Number of food classes  : {unique_classes}")
Dataset Summary:
- Total annotations       : 513
- Unique image files      : 513
- Number of food classes  : 10

Observation :

  • The dictionary maps each food class name to a unique integer index from 0 to 9, following the alphabetical order of class names.

  • Class names like 'Apple Pie' and 'Chocolate' come first as they are alphabetically earlier.

  • The mapping is case-sensitive and sorted lexicographically, so lowercase names like 'onion_rings', 'pancakes', 'spring_rolls', and 'tacos' appear after the capitalized ones due to ASCII sorting rules.

  • This consistent and reproducible mapping is essential for:

    • Encoding labels during model training.

    • Decoding predictions back to readable class names.

  • With 10 classes total, this dictionary covers all classes with unique indices and no duplicates or missing entries.

In [297]:
# Function to display bounding boxes for specified classes

def show_bboxes(df, n=5, classes_to_show=None):
    # Filter by class if specified
    if classes_to_show:
        #classes_to_show = [cls.lower().replace(" ", "_") for cls in classes_to_show]
        #df['class'] = df['class'].str.lower()
        filtered_df = df[df['class'].isin(classes_to_show)]
        if filtered_df.empty:
            print(f"⚠️ No images found for classes: {classes_to_show}")
            return
    else:
        filtered_df = df

    img_files = filtered_df['filename'].unique()
    total = min(n, len(img_files))

    # Prepare grid layout (e.g., 5 images in 1 row)
    fig, axes = plt.subplots(1, total, figsize=(5 * total, 5))

    # If only one image, axes is not iterable
    if total == 1:
        axes = [axes]

    for idx in range(total):
        img_file = img_files[idx]
        img_path = os.path.join(img_folder, img_file)
        if not os.path.exists(img_path):
            print(f"❌ Image not found: {img_path}")
            continue

        img = cv2.imread(img_path)
        if img is None:
            print(f"⚠️ Unable to read image: {img_file}")
            continue

        img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        ax = axes[idx]
        ax.imshow(img_rgb)

        # Draw all boxes for the current image
        for _, row in filtered_df[filtered_df['filename'] == img_file].iterrows():
            x_min, y_min, x_max, y_max = int(row['xmin']), int(row['ymin']), int(row['xmax']), int(row['ymax'])
            label = row['class']
            rect = patches.Rectangle((x_min, y_min), x_max - x_min, y_max - y_min,
                                     linewidth=2, edgecolor='red', facecolor='none')
            ax.add_patch(rect)
            ax.text(x_min, y_min - 5, label, color='red', fontsize=10, backgroundcolor='white')

        ax.axis('off')
        ax.set_title(f"{img_file}", fontsize=10)

    plt.tight_layout()
    plt.show()
In [299]:
# Show 5 images with boxes only for 'Apple Pie'
show_bboxes(food_annotations_df, n=5, classes_to_show=['Apple Pie'])
No description has been provided for this image
In [300]:
# Show 5 images with boxes only for 'French Fries'
show_bboxes(food_annotations_df, n=5, classes_to_show=['French Fries'])
No description has been provided for this image
In [29]:
# Show 5 images with boxes only for 'pancakes'
show_bboxes(food_annotations_df, n=5, classes_to_show=['pancakes'])
No description has been provided for this image
In [30]:
# Show 5 images with boxes only for 'tacos'
show_bboxes(food_annotations_df, n=5, classes_to_show=['tacos'])
No description has been provided for this image
In [31]:
# Show 5 images with boxes only for 'Pizza'
show_bboxes(food_annotations_df, n=5, classes_to_show=['Pizza'])
No description has been provided for this image
In [32]:
# Show 5 images with boxes only for 'Nachos'
show_bboxes(food_annotations_df, n=5, classes_to_show=['Nachos'])
No description has been provided for this image
In [33]:
# Show 5 images with boxes only for 'onion_rings'
show_bboxes(food_annotations_df, n=5, classes_to_show=['onion_rings'])
No description has been provided for this image
In [34]:
# Show 5 images with boxes only for 'spring_rolls'
show_bboxes(food_annotations_df, n=5, classes_to_show=['spring_rolls'])
No description has been provided for this image
In [ ]:
# Show 5 images with boxes only for 'hot_dog'
show_bboxes(food_annotations_df, n=5, classes_to_show=['Hotdog'])
No description has been provided for this image
In [39]:
# Show 5 images with boxes only for 'chocolate_cake'
show_bboxes(food_annotations_df, n=5, classes_to_show=['Chocolate'])
No description has been provided for this image

Step 5: Design, train and test basic CNN models to classify the flood.¶

Utilities Functions¶

In [1334]:
from tensorflow.keras.callbacks import ModelCheckpoint, ReduceLROnPlateau, EarlyStopping

def train_model(model, X_train, y_train, X_val, y_val, epochs=50, batch_size=32, filepath='model_best.weights.h5'):
    checkpointer = ModelCheckpoint(
        filepath=filepath,
        verbose=1,
        save_best_only=True,
        save_weights_only=True
    )

    earlystopping = EarlyStopping(
        monitor='val_loss',
        min_delta=0.01,
        patience=20,
        mode='auto'
    )

    reduceLR = ReduceLROnPlateau(
        monitor='val_loss',
        factor=0.5,
        patience=10,
        mode='auto'
    )

    history = model.fit(
        X_train, y_train,
        validation_data=(X_val, y_val),
        epochs=epochs,
        batch_size=batch_size,
        callbacks=[checkpointer, reduceLR, earlystopping],
        verbose=1
    )
    
    return history
In [1433]:
import matplotlib.pyplot as plt
import numpy as np

def plot_training_history(history, model, X_test, y_test=None, model_name="Model"):
    """
    Plot training and validation metrics from model history,
    and evaluate accuracy/loss on test data.
    
    Args:
        history: History object returned from model.fit()
        model: Trained Keras model
        X_test: Test feature set
        y_test: Test labels
        model_name: Name of the model for the plot title
    """
    # Create figure with two subplots
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))
    
    # Plot accuracy
    ax1.plot(history.history['accuracy'], label='Training Accuracy')
    ax1.plot(history.history['val_accuracy'], label='Validation Accuracy')
    ax1.set_title(f'{model_name} - Accuracy')
    ax1.set_xlabel('Epoch')
    ax1.set_ylabel('Accuracy')
    ax1.legend()
    ax1.grid(True)
    
    # Plot loss
    ax2.plot(history.history['loss'], label='Training Loss')
    ax2.plot(history.history['val_loss'], label='Validation Loss')
    ax2.set_title(f'{model_name} - Loss')
    ax2.set_xlabel('Epoch')
    ax2.set_ylabel('Loss')
    ax2.legend()
    ax2.grid(True)
    
    plt.tight_layout()
    plt.show()
    
    # Print final training/validation metrics
    print(f"\n🔍 Final Epoch Metrics:")
    print(f"📈 Training Accuracy     : {history.history['accuracy'][-1]:.2f}")
    print(f"📉 Training Loss         : {history.history['loss'][-1]:.2f}")
    print(f"📈 Validation Accuracy   : {history.history['val_accuracy'][-1]:.2f}")
    print(f"📉 Validation Loss       : {history.history['val_loss'][-1]:.4f}")

    
    
    # Evaluate on test data
    test_loss, test_accuracy = model.evaluate(X_test, y_test, verbose=0)
    print(f"\n🧪 Test Accuracy         : {test_accuracy:.2f}")
    print(f"🧪 Test Loss             : {test_loss:.2f}")
In [796]:
import numpy as np
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import classification_report, confusion_matrix
import seaborn as sns
import matplotlib.pyplot as plt
from pandas import DataFrame

def evaluate_classification_model(model, X_test, y_test, y_train=None):
    """
    Evaluate a classification model: prints classification report and shows confusion matrix.

    Parameters:
    - model: Trained Keras model
    - X_test: Test features
    - y_test: True labels (can be one-hot or class indices)
    - y_train: (Optional) Training labels to ensure LabelEncoder covers all classes
    """

    # Ensure X_test is a NumPy array with dtype float32
    X_test = np.array(X_test).astype(np.float32)  # 🔧 Fix applied here

    # Predict class probabilities
    y_pred_probs = model.predict(X_test)

    # Get predicted class indices
    y_pred_class = np.argmax(y_pred_probs, axis=1)

    # Convert y_test to class indices if one-hot encoded
    if y_test.ndim > 1 and y_test.shape[1] > 1:
        y_test_class = np.argmax(y_test, axis=1)
    else:
        y_test_class = y_test.ravel().astype(int)

    # Fit LabelEncoder on combined labels if y_train is provided
    if y_train is not None:
        all_labels = np.concatenate([y_train.ravel(), y_test_class])
    else:
        all_labels = y_test_class

    label_encoder = LabelEncoder()
    label_encoder.fit(all_labels)

    # Decode predicted and true labels to class names
    y_test_labels = label_encoder.inverse_transform(y_test_class.astype(int))
    y_pred_labels = label_encoder.inverse_transform(y_pred_class.astype(int))
    class_names = sorted(food_annotations_df['class'].unique())

    # Print classification report
    print("Classification Report:")
    print(classification_report(y_test_labels, y_pred_labels, target_names=class_names, zero_division=0))

    # Confusion Matrix
    conf_mat = confusion_matrix(y_test_class, y_pred_class, labels=label_encoder.classes_)
  
    # Plot Confusion Matrix
    plt.figure(figsize=(10, 8))
    sns.heatmap(conf_mat, annot=True, fmt='d', xticklabels=class_names, yticklabels=class_names, cmap='Blues')
    plt.xlabel('Predicted')
    plt.ylabel('Actual')
    plt.title('Confusion Matrix')
    plt.show()
In [1382]:
import random
import matplotlib.pyplot as plt
import numpy as np

def plot_random_predictions(X_test, y_test, class_names, model, num_samples=5):
    """
    Plots random test samples with predicted and actual labels, showing 5 images per row max.
    Correct predictions are shown in green, incorrect in red.

    Args:
        X_test (np.array): Test images, shape (N, H, W, C)
        y_test (np.array): One-hot encoded labels, shape (N, num_classes)
        class_names (list): List of class names corresponding to label indices
        model (keras.Model): Trained classification model
        num_samples (int): Number of random samples to display (default: 5)
    """
    indices = random.sample(range(len(X_test)), num_samples)
    
    cols = 5
    rows = (num_samples + cols - 1) // cols  # Ceiling division to get rows

    plt.figure(figsize=(cols * 3, rows * 3))  # Adjust figure size

    for i, idx in enumerate(indices):
        img = X_test[idx]
        true_label = np.argmax(y_test[idx])
        pred_label = np.argmax(model.predict(np.expand_dims(img, axis=0), verbose=0))

        color = 'green' if pred_label == true_label else 'red'
        title_text = f"Pred: {class_names[pred_label]}\nActual: {class_names[true_label]}"

        plt.subplot(rows, cols, i + 1)
        plt.imshow(img)
        plt.title(title_text, color=color, fontsize=10)
        plt.axis('off')

    plt.suptitle("Model Predictions on Random Test Images", fontsize=16)
    plt.tight_layout()
    plt.subplots_adjust(top=0.85)  # Make space for suptitle
    plt.show()

Step 5.1 Build Basic CNN 1 (Improved CNN)¶

Step 5.1.1:Preprocess Data¶

In [1075]:
# Import train_test_split to split data into training and testing sets with optional stratification
from sklearn.model_selection import train_test_split

# Import to_categorical to convert integer labels into one-hot encoded format for classification models
from tensorflow.keras.utils import to_categorical

# Import img_to_array to convert PIL Images or numpy arrays to proper array format for model input
from tensorflow.keras.preprocessing.image import img_to_array
In [1215]:
# Extract all unique food class names from the 'class' column in the annotations DataFrame,
# then sort them alphabetically to create a consistent ordered list of class names
class_names = sorted(food_annotations_df['class'].unique())
In [1216]:
class_names
Out[1216]:
['Apple Pie',
 'Chocolate',
 'French Fries',
 'Hotdog',
 'Nachos',
 'Pizza',
 'onion_rings',
 'pancakes',
 'spring_rolls',
 'tacos']
In [1217]:
# Create a dictionary mapping each class name to a unique integer index,
# where indices correspond to the position of the class name in the sorted list
class_to_idx = {cls: idx for idx, cls in enumerate(class_names)}
In [1218]:
class_to_idx
Out[1218]:
{'Apple Pie': 0,
 'Chocolate': 1,
 'French Fries': 2,
 'Hotdog': 3,
 'Nachos': 4,
 'Pizza': 5,
 'onion_rings': 6,
 'pancakes': 7,
 'spring_rolls': 8,
 'tacos': 9}
In [1219]:
# Encode class labels
food_annotations_df['label'] = food_annotations_df['class'].map(class_to_idx)
In [1220]:
food_annotations_df
Out[1220]:
filename width height class xmin ymin xmax ymax label
0 2909830_jpg.rf.bb9125215f38f22139f72d04f19e693... 512 512 Apple Pie 210 43 397 259 0
1 108743_jpg.rf.260978b4f8ae78f4ebb41f48ef501679... 512 384 French Fries 50 3 442 383 2
2 149278_jpg.rf.86187fd5bd1698133cb7a973c6060449... 512 384 French Fries 33 0 260 167 2
3 2986199_jpg.rf.ac0b99e71100520e6608ef72b12ee27... 512 512 Apple Pie 28 37 291 233 0
4 2934928_jpg.rf.c8f427a0d3e7ba9342fe37276fb15ab... 512 512 Apple Pie 9 54 463 465 0
... ... ... ... ... ... ... ... ... ...
508 30292-hotdog_jpg.rf.0390f5521fb9e6e7e3acb2a6a8... 640 640 Hotdog 56 0 640 605 3
509 14043-hotdog_jpg.rf.8336579be067ac62410422f411... 640 640 Hotdog 48 124 404 526 3
510 8006-hotdog_jpg.rf.2b5a43d73a7b80e624c778536e2... 640 640 Hotdog 65 45 623 640 3
511 4345-hotdog_jpg.rf.c81f7d5ae5388487ceea9df4709... 640 640 Hotdog 2 8 640 640 3
512 51643-hotdog_jpg.rf.2eeb177096d2f26e6f38322d53... 640 640 Hotdog 3 26 483 607 3

513 rows × 9 columns

  • A new column 'label' is added to the food_annotations_df DataFrame, where each food class name in the 'class' column is replaced with its corresponding integer index from the class_to_idx dictionary. This numeric encoding is necessary for training machine learning models that require labels as integers.
In [1253]:
# --- Load images and corresponding labels ---
img_folder = 'Datasetv1/original_images/'
images = []
labels = []

for _, row in food_annotations_df.iterrows():
    img_path = os.path.join(img_folder, row['filename'])
    img = cv2.imread(img_path)

    if img is not None:
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # Convert from BGR to RGB
        img = cv2.resize(img, (128, 128))           # Resize to 128x128
        #img = img_to_array(img) / 255.0             # Normalize to [0, 1]
        images.append(img)
        labels.append(row['class'])
In [1287]:
# --- Convert lists of images and labels to NumPy arrays ---
X = np.array(images)
y = np.array(labels)

# Display the shapes of the feature and label arrays
print(f"Shape of image data (X): {X.shape}")
print(f"Shape of label data (y): {y.shape}")
Shape of image data (X): (513, 128, 128, 3)
Shape of label data (y): (513,)
In [1288]:
y
Out[1288]:
array(['Apple Pie', 'French Fries', 'French Fries', 'Apple Pie',
       'Apple Pie', 'French Fries', 'Apple Pie', 'Apple Pie',
       'French Fries', 'Apple Pie', 'French Fries', 'French Fries',
       'Apple Pie', 'Apple Pie', 'Apple Pie', 'French Fries', 'Apple Pie',
       'French Fries', 'French Fries', 'French Fries', 'French Fries',
       'Apple Pie', 'French Fries', 'French Fries', 'French Fries',
       'French Fries', 'Apple Pie', 'Apple Pie', 'Apple Pie',
       'French Fries', 'French Fries', 'Apple Pie', 'Apple Pie',
       'Apple Pie', 'Apple Pie', 'French Fries', 'Apple Pie',
       'French Fries', 'Apple Pie', 'French Fries', 'Apple Pie',
       'French Fries', 'French Fries', 'French Fries', 'French Fries',
       'French Fries', 'French Fries', 'Apple Pie', 'Apple Pie',
       'Apple Pie', 'French Fries', 'Apple Pie', 'French Fries',
       'French Fries', 'French Fries', 'French Fries', 'Apple Pie',
       'Apple Pie', 'Apple Pie', 'Apple Pie', 'Apple Pie', 'Apple Pie',
       'French Fries', 'French Fries', 'French Fries', 'Apple Pie',
       'French Fries', 'Apple Pie', 'Apple Pie', 'French Fries',
       'French Fries', 'Apple Pie', 'Apple Pie', 'French Fries',
       'French Fries', 'Apple Pie', 'Apple Pie', 'Apple Pie',
       'French Fries', 'French Fries', 'French Fries', 'Apple Pie',
       'Apple Pie', 'French Fries', 'Apple Pie', 'French Fries',
       'French Fries', 'French Fries', 'French Fries', 'Apple Pie',
       'French Fries', 'Apple Pie', 'French Fries', 'Apple Pie',
       'French Fries', 'Apple Pie', 'Apple Pie', 'Apple Pie',
       'French Fries', 'Apple Pie', 'pancakes', 'pancakes', 'pancakes',
       'pancakes', 'pancakes', 'pancakes', 'pancakes', 'pancakes',
       'pancakes', 'pancakes', 'pancakes', 'pancakes', 'pancakes',
       'pancakes', 'pancakes', 'pancakes', 'pancakes', 'pancakes',
       'pancakes', 'pancakes', 'pancakes', 'pancakes', 'pancakes',
       'pancakes', 'pancakes', 'pancakes', 'pancakes', 'pancakes',
       'pancakes', 'pancakes', 'pancakes', 'pancakes', 'pancakes',
       'pancakes', 'pancakes', 'pancakes', 'pancakes', 'pancakes',
       'pancakes', 'pancakes', 'pancakes', 'pancakes', 'pancakes',
       'pancakes', 'pancakes', 'pancakes', 'pancakes', 'pancakes',
       'pancakes', 'pancakes', 'pancakes', 'pancakes', 'pancakes',
       'pancakes', 'pancakes', 'pancakes', 'pancakes', 'tacos', 'tacos',
       'tacos', 'tacos', 'tacos', 'tacos', 'tacos', 'tacos', 'tacos',
       'tacos', 'tacos', 'tacos', 'tacos', 'tacos', 'tacos', 'tacos',
       'tacos', 'tacos', 'tacos', 'tacos', 'tacos', 'tacos', 'tacos',
       'tacos', 'tacos', 'tacos', 'tacos', 'tacos', 'tacos', 'tacos',
       'tacos', 'tacos', 'tacos', 'tacos', 'tacos', 'tacos', 'tacos',
       'tacos', 'tacos', 'tacos', 'tacos', 'tacos', 'tacos', 'tacos',
       'tacos', 'tacos', 'tacos', 'tacos', 'tacos', 'tacos', 'tacos',
       'tacos', 'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza',
       'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza',
       'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza',
       'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza',
       'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza',
       'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza',
       'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza', 'Pizza',
       'Pizza', 'Pizza', 'Nachos', 'Nachos', 'Nachos', 'Nachos', 'Nachos',
       'Nachos', 'Nachos', 'Nachos', 'Nachos', 'Nachos', 'Nachos',
       'Nachos', 'Nachos', 'Nachos', 'Nachos', 'Nachos', 'Nachos',
       'Nachos', 'Nachos', 'Nachos', 'Nachos', 'Nachos', 'Nachos',
       'Nachos', 'Nachos', 'Nachos', 'Nachos', 'Nachos', 'Nachos',
       'Nachos', 'Nachos', 'Nachos', 'Nachos', 'Nachos', 'Nachos',
       'Nachos', 'Nachos', 'Nachos', 'Nachos', 'Nachos', 'Nachos',
       'Nachos', 'Nachos', 'Nachos', 'Nachos', 'Nachos', 'Nachos',
       'Nachos', 'Nachos', 'Nachos', 'onion_rings', 'onion_rings',
       'onion_rings', 'onion_rings', 'onion_rings', 'onion_rings',
       'onion_rings', 'onion_rings', 'onion_rings', 'onion_rings',
       'onion_rings', 'onion_rings', 'onion_rings', 'onion_rings',
       'onion_rings', 'onion_rings', 'onion_rings', 'onion_rings',
       'onion_rings', 'onion_rings', 'onion_rings', 'onion_rings',
       'onion_rings', 'onion_rings', 'onion_rings', 'onion_rings',
       'onion_rings', 'onion_rings', 'onion_rings', 'onion_rings',
       'onion_rings', 'onion_rings', 'onion_rings', 'onion_rings',
       'onion_rings', 'onion_rings', 'onion_rings', 'onion_rings',
       'onion_rings', 'onion_rings', 'onion_rings', 'onion_rings',
       'onion_rings', 'onion_rings', 'onion_rings', 'onion_rings',
       'onion_rings', 'onion_rings', 'onion_rings', 'onion_rings',
       'onion_rings', 'spring_rolls', 'spring_rolls', 'spring_rolls',
       'spring_rolls', 'spring_rolls', 'spring_rolls', 'spring_rolls',
       'spring_rolls', 'spring_rolls', 'spring_rolls', 'spring_rolls',
       'spring_rolls', 'spring_rolls', 'spring_rolls', 'spring_rolls',
       'spring_rolls', 'spring_rolls', 'spring_rolls', 'spring_rolls',
       'spring_rolls', 'spring_rolls', 'spring_rolls', 'spring_rolls',
       'spring_rolls', 'spring_rolls', 'spring_rolls', 'spring_rolls',
       'spring_rolls', 'spring_rolls', 'spring_rolls', 'spring_rolls',
       'spring_rolls', 'spring_rolls', 'spring_rolls', 'spring_rolls',
       'spring_rolls', 'spring_rolls', 'spring_rolls', 'spring_rolls',
       'spring_rolls', 'spring_rolls', 'spring_rolls', 'spring_rolls',
       'spring_rolls', 'spring_rolls', 'spring_rolls', 'spring_rolls',
       'spring_rolls', 'spring_rolls', 'spring_rolls', 'spring_rolls',
       'spring_rolls', 'spring_rolls', 'Chocolate', 'Chocolate',
       'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate',
       'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate',
       'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate',
       'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate',
       'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate',
       'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate',
       'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate',
       'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate',
       'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate', 'Chocolate',
       'Chocolate', 'Chocolate', 'Chocolate', 'Hotdog', 'Hotdog',
       'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog',
       'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog',
       'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog',
       'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog',
       'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog',
       'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog',
       'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog',
       'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog', 'Hotdog'],
      dtype='<U12')
In [1090]:
# --- Convert to NumPy arrays ---
#X = np.array(images)
#y = to_categorical(labels, num_classes=len(class_names))  # One-hot encode the labels
  • Verifying an image and its label after splitting into X (images) and y (target labels), and your y contains target labels
In [1291]:
import matplotlib.pyplot as plt
import random

# Number of images to display
num_display = 5

# Randomly pick image indices
indices = random.sample(range(len(images)), num_display)

plt.figure(figsize=(15, 5))

for i, idx in enumerate(indices):
    plt.subplot(1, num_display, i + 1)
    plt.imshow(images[idx])
    class_name = y[idx]  # Get class name using label index
    plt.title(f"Label: {class_name}")
    plt.axis('off')

plt.suptitle("Sample Images with Labels", fontsize=16)
plt.tight_layout()
plt.show()
No description has been provided for this image
In [1293]:
# Encode labels to integers first
from sklearn.preprocessing import LabelEncoder
label_encoder = LabelEncoder()
y_encoded = label_encoder.fit_transform(y)
# print summary
print("Labels encoded successfully.")
print(f"Number of classes: {len(label_encoder.classes_)}")
Labels encoded successfully.
Number of classes: 10
In [1294]:
# Get all unique class labels (original) and their encoded values
class_names = label_encoder.classes_

print("Label Mapping (Original Label → Encoded Index):")
for idx, label in enumerate(class_names):
    print(f"{idx}: {label}")
Label Mapping (Original Label → Encoded Index):
0: Apple Pie
1: Chocolate
2: French Fries
3: Hotdog
4: Nachos
5: Pizza
6: onion_rings
7: pancakes
8: spring_rolls
9: tacos

Train Test Split:¶

In [1298]:
# Split into train and temp sets (80% train, 20% temp), with stratification
X_train, X_temp, y_train_encoded, y_temp_encoded = train_test_split(
    X, y_encoded, test_size=0.2, random_state=42, stratify=y_encoded
)

# Split temp into validation and test (each 10% of total), with stratification
X_valid, X_test, y_valid_encoded, y_test_encoded = train_test_split(
    X_temp, y_temp_encoded, test_size=0.5, random_state=42, stratify=y_temp_encoded
)

# One-hot encode the labels
y_train = to_categorical(y_train_encoded)
y_valid = to_categorical(y_valid_encoded)
y_test = to_categorical(y_test_encoded)


# Print the shapes of the splits
print("Dataset Split Summary:")
print(f"Train set    → X: {X_train.shape}, y: {y_train.shape}")
print(f"Validation   → X: {X_valid.shape}, y: {y_valid.shape}")
print(f"Test set     → X: {X_test.shape}, y: {y_test.shape}")
Dataset Split Summary:
Train set    → X: (410, 128, 128, 3), y: (410, 10)
Validation   → X: (51, 128, 128, 3), y: (51, 10)
Test set     → X: (52, 128, 128, 3), y: (52, 10)

Verify image-label mapping after splitting¶

In [1335]:
print(label_encoder.classes_)
['Apple Pie' 'Chocolate' 'French Fries' 'Hotdog' 'Nachos' 'Pizza'
 'onion_rings' 'pancakes' 'spring_rolls' 'tacos']
In [1310]:
np.argmax(y_train[1])
Out[1310]:
7
In [1312]:
# ------------------------------
# Display a Random Training Image with its Label
# ------------------------------

def show_samples(X, y, class_names, num_samples=5):
    plt.figure(figsize=(15, 5))

    for i in range(num_samples):
        img = X[i]
        label_idx = np.argmax(y[i])  # Convert one-hot label to index

        plt.subplot(1, num_samples, i + 1)
        plt.imshow(img)
        plt.title(f"Label: {class_names[label_idx]}")
        plt.axis('off')

    plt.suptitle("Sample Training Images with Labels", fontsize=16)
    plt.tight_layout()
    plt.show()

# Call the function
show_samples(X_train, y_train, label_encoder.classes_)
No description has been provided for this image
In [1313]:
# Check lengths
print(len(X_train), len(y_train))  # Should be equal
print(len(X_test), len(y_test))    # Should be equal
410 410
52 52
In [1314]:
# Check label distribution consistency

import numpy as np
import collections

# Convert one-hot encoded labels to class indices
y_train_labels = np.argmax(y_train, axis=1)
y_test_labels = np.argmax(y_test, axis=1)

# Count label distribution
print("Train label distribution:", collections.Counter(y_train_labels))
print("Test label distribution:", collections.Counter(y_test_labels))
Train label distribution: Counter({7: 45, 8: 42, 9: 42, 6: 41, 2: 41, 4: 40, 1: 40, 3: 40, 5: 40, 0: 39})
Test label distribution: Counter({7: 6, 8: 6, 4: 5, 6: 5, 2: 5, 3: 5, 1: 5, 9: 5, 0: 5, 5: 5})
In [1315]:
import matplotlib.pyplot as plt

# Count label distribution
train_counts = collections.Counter(y_train_labels)
test_counts = collections.Counter(y_test_labels)

# Sort labels for consistent plotting
labels = sorted(train_counts.keys())

# Get counts in sorted order
train_values = [train_counts[label] for label in labels]
test_values = [test_counts[label] for label in labels]

# Plotting
x = np.arange(len(labels))
width = 0.35

plt.figure(figsize=(12, 6))
plt.bar(x - width/2, train_values, width, label='Train', color='skyblue')
plt.bar(x + width/2, test_values, width, label='Test', color='salmon')
plt.xlabel('Class Label')
plt.ylabel('Number of Samples')
plt.title('Train vs Test Label Distribution')
plt.xticks(x, labels)
plt.legend()
plt.tight_layout()
plt.grid(axis='y', linestyle='--', alpha=0.7)
plt.show()
No description has been provided for this image

Observation :

  • Dataset is fairly balanced, which is beneficial for model training, as it reduces the risk of bias toward any particular class.
In [1316]:
# ------------------------------
# Display a Random Train Image with its Label
# ------------------------------

import random
import numpy as np
import matplotlib.pyplot as plt

# Pick a random index from the training set
idx = random.randint(0, len(X_train) - 1)

# Convert one-hot encoded label at that index to an integer class index
label_idx = np.argmax(y_train[idx])

# Print the label index to verify which class the image belongs to
print("Label index:", label_idx)

# Display the image at the randomly selected index
plt.imshow(X_train[idx])

# Set the title of the plot to the corresponding class name
plt.title(class_names[label_idx])

# Remove axis ticks and labels for a cleaner display
plt.axis('off')

# Show the image plot
plt.show()
Label index: 2
No description has been provided for this image
In [1318]:
# ------------------------------
# Display a Random Test Image with its Label
# ------------------------------

import random
import numpy as np
import matplotlib.pyplot as plt

# Pick a random index from the training set
idx = random.randint(0, len(X_test) - 1)

# Convert one-hot encoded label at that index to an integer class index
label_idx = np.argmax(y_test[idx])

# Print the label index to verify which class the image belongs to
print("Label index:", label_idx)

# Display the image at the randomly selected index
plt.imshow(X_test[idx])

# Set the title of the plot to the corresponding class name
plt.title(class_names[label_idx])

# Remove axis ticks and labels for a cleaner display
plt.axis('off')

# Show the image plot
plt.show()
Label index: 6
No description has been provided for this image

Step 5.1.2: Build a Basic CNN¶

In [1329]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from tensorflow.keras.layers import Input
from tensorflow.keras.optimizers import Adam

# Define a simple CNN model for multi-class classification
basic_cnn_model_1 = Sequential([
    Input(shape=(128, 128, 3)),  # Input layer specifying image size and channels (RGB)

    # First convolution + pooling block
    Conv2D(32, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),

    # Second convolution + pooling block
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),

    # Third convolution + pooling block
    Conv2D(128, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),

    # Flatten feature maps to a 1D vector for dense layers
    Flatten(),

    # Fully connected layer with 128 neurons
    Dense(128, activation='relu'),
    Dropout(0.5), # Dropout for regularization to prevent overfitting

    # Output layer with number of classes and softmax activation
    Dense(len(class_names), activation='softmax')
])

# Compile the model with Adam optimizer, categorical crossentropy loss for multi-class, and accuracy metric
basic_cnn_model_1.compile(
    optimizer=Adam(learning_rate=0.001),
    loss='categorical_crossentropy',
    metrics=['accuracy']
)

# Print model architecture summary
basic_cnn_model_1.summary()
Model: "sequential_67"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ conv2d_260 (Conv2D)             │ (None, 126, 126, 32)   │           896 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d_253               │ (None, 63, 63, 32)     │             0 │
│ (MaxPooling2D)                  │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_261 (Conv2D)             │ (None, 61, 61, 64)     │        18,496 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d_254               │ (None, 30, 30, 64)     │             0 │
│ (MaxPooling2D)                  │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_262 (Conv2D)             │ (None, 28, 28, 128)    │        73,856 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d_255               │ (None, 14, 14, 128)    │             0 │
│ (MaxPooling2D)                  │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ flatten_50 (Flatten)            │ (None, 25088)          │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_135 (Dense)               │ (None, 128)            │     3,211,392 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_182 (Dropout)           │ (None, 128)            │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_136 (Dense)               │ (None, 10)             │         1,290 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 3,305,930 (12.61 MB)
 Trainable params: 3,305,930 (12.61 MB)
 Non-trainable params: 0 (0.00 B)

5.1.3: Train the Model¶

In [1336]:
# Train the model
# history = basic_cnn_model_1.fit(
#     X_train,              # Training input features (e.g., images, text sequences)
#     y_train,
#     validation_data=(X_val, y_val),                            # Corresponding training labels
#     epochs=10,            # Number of times the model will iterate over the entire training data
#     batch_size=16,        # Number of samples the model processes before updating weights
#     #validation_split=0.2  # Fraction of training data (20%) used for validation (i.e., 80% train, 20% validate)
# )

history = train_model(basic_cnn_model_1, X_train, y_train, X_valid, y_valid, epochs=20, batch_size=16)
Epoch 1/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 57ms/step - accuracy: 0.2579 - loss: 2.0964
Epoch 1: val_loss improved from inf to 2.23658, saving model to model_best.weights.h5
26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 63ms/step - accuracy: 0.2568 - loss: 2.0971 - val_accuracy: 0.1765 - val_loss: 2.2366 - learning_rate: 2.5000e-04
Epoch 2/20
26/26 ━━━━━━━━━━━━━━━━━━━━ 0s 50ms/step - accuracy: 0.3203 - loss: 1.9839
Epoch 2: val_loss did not improve from 2.23658
26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 53ms/step - accuracy: 0.3204 - loss: 1.9822 - val_accuracy: 0.1569 - val_loss: 2.2678 - learning_rate: 2.5000e-04
Epoch 3/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 51ms/step - accuracy: 0.4405 - loss: 1.7554
Epoch 3: val_loss did not improve from 2.23658
26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 53ms/step - accuracy: 0.4404 - loss: 1.7515 - val_accuracy: 0.1569 - val_loss: 2.3056 - learning_rate: 2.5000e-04
Epoch 4/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 54ms/step - accuracy: 0.4877 - loss: 1.5367
Epoch 4: val_loss did not improve from 2.23658
26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 56ms/step - accuracy: 0.4904 - loss: 1.5339 - val_accuracy: 0.1373 - val_loss: 2.3733 - learning_rate: 2.5000e-04
Epoch 5/20
26/26 ━━━━━━━━━━━━━━━━━━━━ 0s 48ms/step - accuracy: 0.6255 - loss: 1.2175
Epoch 5: val_loss did not improve from 2.23658
26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 51ms/step - accuracy: 0.6248 - loss: 1.2186 - val_accuracy: 0.1569 - val_loss: 2.4882 - learning_rate: 2.5000e-04
Epoch 6/20
26/26 ━━━━━━━━━━━━━━━━━━━━ 0s 48ms/step - accuracy: 0.6634 - loss: 1.0593
Epoch 6: val_loss did not improve from 2.23658
26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 51ms/step - accuracy: 0.6643 - loss: 1.0581 - val_accuracy: 0.2353 - val_loss: 2.7572 - learning_rate: 2.5000e-04
Epoch 7/20
26/26 ━━━━━━━━━━━━━━━━━━━━ 0s 48ms/step - accuracy: 0.7401 - loss: 0.8608
Epoch 7: val_loss did not improve from 2.23658
26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 50ms/step - accuracy: 0.7395 - loss: 0.8611 - val_accuracy: 0.1961 - val_loss: 3.0568 - learning_rate: 2.5000e-04
Epoch 8/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 49ms/step - accuracy: 0.8043 - loss: 0.7184
Epoch 8: val_loss did not improve from 2.23658
26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 51ms/step - accuracy: 0.8060 - loss: 0.7156 - val_accuracy: 0.1569 - val_loss: 3.6817 - learning_rate: 2.5000e-04
Epoch 9/20
26/26 ━━━━━━━━━━━━━━━━━━━━ 0s 48ms/step - accuracy: 0.8389 - loss: 0.6055
Epoch 9: val_loss did not improve from 2.23658
26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 51ms/step - accuracy: 0.8388 - loss: 0.6054 - val_accuracy: 0.1373 - val_loss: 3.5692 - learning_rate: 2.5000e-04
Epoch 10/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 53ms/step - accuracy: 0.8842 - loss: 0.4877
Epoch 10: val_loss did not improve from 2.23658
26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 55ms/step - accuracy: 0.8846 - loss: 0.4838 - val_accuracy: 0.1373 - val_loss: 4.1033 - learning_rate: 2.5000e-04
Epoch 11/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 49ms/step - accuracy: 0.8958 - loss: 0.3359
Epoch 11: val_loss did not improve from 2.23658
26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 51ms/step - accuracy: 0.8956 - loss: 0.3397 - val_accuracy: 0.1765 - val_loss: 3.5208 - learning_rate: 2.5000e-04
Epoch 12/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 49ms/step - accuracy: 0.9099 - loss: 0.3405
Epoch 12: val_loss did not improve from 2.23658
26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 51ms/step - accuracy: 0.9101 - loss: 0.3392 - val_accuracy: 0.1765 - val_loss: 3.7455 - learning_rate: 1.2500e-04
Epoch 13/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 50ms/step - accuracy: 0.9513 - loss: 0.2061
Epoch 13: val_loss did not improve from 2.23658
26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 52ms/step - accuracy: 0.9506 - loss: 0.2076 - val_accuracy: 0.0980 - val_loss: 4.4924 - learning_rate: 1.2500e-04
Epoch 14/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 49ms/step - accuracy: 0.9460 - loss: 0.2314
Epoch 14: val_loss did not improve from 2.23658
26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 51ms/step - accuracy: 0.9460 - loss: 0.2312 - val_accuracy: 0.1373 - val_loss: 5.0421 - learning_rate: 1.2500e-04
Epoch 15/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 50ms/step - accuracy: 0.9699 - loss: 0.1522
Epoch 15: val_loss did not improve from 2.23658
26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 52ms/step - accuracy: 0.9698 - loss: 0.1532 - val_accuracy: 0.0784 - val_loss: 4.6256 - learning_rate: 1.2500e-04
Epoch 16/20
26/26 ━━━━━━━━━━━━━━━━━━━━ 0s 48ms/step - accuracy: 0.9668 - loss: 0.1342
Epoch 16: val_loss did not improve from 2.23658
26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 51ms/step - accuracy: 0.9665 - loss: 0.1352 - val_accuracy: 0.1176 - val_loss: 4.5482 - learning_rate: 1.2500e-04
Epoch 17/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 49ms/step - accuracy: 0.9572 - loss: 0.1650
Epoch 17: val_loss did not improve from 2.23658
26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 51ms/step - accuracy: 0.9577 - loss: 0.1632 - val_accuracy: 0.1176 - val_loss: 4.9174 - learning_rate: 1.2500e-04
Epoch 18/20
26/26 ━━━━━━━━━━━━━━━━━━━━ 0s 49ms/step - accuracy: 0.9586 - loss: 0.1724
Epoch 18: val_loss did not improve from 2.23658
26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 51ms/step - accuracy: 0.9586 - loss: 0.1723 - val_accuracy: 0.1176 - val_loss: 4.3938 - learning_rate: 1.2500e-04
Epoch 19/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 50ms/step - accuracy: 0.9883 - loss: 0.0975
Epoch 19: val_loss did not improve from 2.23658
26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 52ms/step - accuracy: 0.9881 - loss: 0.0966 - val_accuracy: 0.1765 - val_loss: 5.4650 - learning_rate: 1.2500e-04
Epoch 20/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 49ms/step - accuracy: 0.9696 - loss: 0.1042
Epoch 20: val_loss did not improve from 2.23658
26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 50ms/step - accuracy: 0.9703 - loss: 0.1042 - val_accuracy: 0.1765 - val_loss: 5.5215 - learning_rate: 1.2500e-04

Model Evaluation & Visualization¶

In [1337]:
# Plot training history and evaluate the model on test data
# ---------------------------------------------------------
# history               : The training history object returned by model.fit(), containing loss and accuracy over epochs
# basic_cnn_model_1     : The trained Keras model to be evaluated
# X_test, y_test        : Test dataset used to evaluate model performance after training
# model_name            : (Optional) Custom name for title/labeling plots and saving figures

plot_training_history(history, basic_cnn_model_1, X_test, y_test, model_name="Basic CNN 1")
No description has been provided for this image
🔍 Final Epoch Metrics:
📈 Training Accuracy     : 0.98
📉 Training Loss         : 0.10
📈 Validation Accuracy   : 0.18
📉 Validation Loss       : 5.5215

🧪 Test Accuracy         : 0.23
🧪 Test Loss             : 4.43

Based on the final epoch metrics, test performance, and epoch-wise training logs, here are detailed observations and insights about your model's training behavior:

Large gap between training and validation/test accuracy and diverging loss values indicate overfitting. Model memorizes training data but fails to generalize.


Observation:

  • Training accuracy steadily improves.
  • Validation accuracy stagnates (~39%) and validation loss increases after epoch 5-6, showing early overfitting.
  • Best generalization observed around epochs 5-6

Conclusion:
Poor generalization on unseen data confirmed by low test accuracy and high test loss.


Summary of Issues

Problem Evidence
Overfitting High train acc vs low val/test acc
Poor generalization Test accuracy and loss worse than validation
Validation loss rise Val loss increases after epoch 5 while train loss decreases
Model complexity Model fits training data too well too quickly
In [1338]:
# Evaluate the trained classification model on the test set
# ----------------------------------------------------------
# basic_cnn_model_1     : The trained Keras model that will be evaluated
# X_test                : Test feature data (e.g., images) for model prediction
# y_test                : True labels (can be one-hot encoded or class indices) for evaluating predictions
# y_train               : Optional — training labels used to fit LabelEncoder on all classes (helps preserve class label mapping)

evaluate_classification_model(basic_cnn_model_1, X_test, y_test, y_train=y_train)
2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 50ms/step
Classification Report:
              precision    recall  f1-score   support

   Apple Pie       0.12      0.20      0.15         5
   Chocolate       0.20      0.20      0.20         5
French Fries       0.50      0.40      0.44         5
      Hotdog       0.00      0.00      0.00         5
      Nachos       0.00      0.00      0.00         5
       Pizza       0.33      0.20      0.25         5
 onion_rings       0.43      0.60      0.50         5
    pancakes       0.33      0.33      0.33         6
spring_rolls       0.20      0.17      0.18         6
       tacos       0.17      0.20      0.18         5

    accuracy                           0.23        52
   macro avg       0.23      0.23      0.22        52
weighted avg       0.23      0.23      0.23        52

No description has been provided for this image
In [1384]:
# Visualize predictions on random test images
# Arguments:
# - X_test              : array of test images (preprocessed, shape like (N, H, W, C))
# - y_test              : one-hot encoded true labels for test images
# - class_names         : list of class label names corresponding to indices
# - basic_cnn_model_1   : trained classification model
# - num_samples         : number of random samples to display (default is 5)

plot_random_predictions(X_test, y_test, class_names, basic_cnn_model_1, num_samples=20)
No description has been provided for this image

Classification Report Summary:

  • Overall accuracy is low (~28%), showing the model struggles with correct predictions.
  • Most classes have poor precision, recall, and F1-scores; the highest F1 is ~0.52 (Chocolate).
  • Some classes (e.g., Hotdog) have zero precision and recall, meaning no correct predictions.
  • Classes like Nachos show higher recall but low precision, indicating many false positives.
  • The model likely underfits or lacks discriminative features.
  • Recommendations:
    • Increase dataset size or balance classes.
    • Apply data augmentation.
    • Use class weighting or sampling strategies.
    • Tune or try more powerful models (e.g., transfer learning).

Observations on Confusion Matrix

This confusion matrix shows how well a model predicts food items. Here are some key points:

  1. Good Predictions:

    • The model is best at predicting "spring_rolls" (10 correct) and "tacos" (8 correct).
    • "chocolate" (7 correct), "nachos" (7 correct), and "pancakes" (6 correct) are also predicted well.
  2. Common Mistakes:

    • "Apple Pie" is often confused with "pizza" (4 times) and "tacos" (4 times).
    • "French Fries" is mistaken for "tacos" (3 times).
    • "Hotdog" is confused with "tacos" (5 times).
    • "onion_rings" is often predicted as "french_fries" (6 times).
  3. Overall Performance:

    • The model struggles with "Apple Pie" and "onion_rings" the most, as they have low correct predictions (2 and 4).
    • Some classes like "pizza" and "hotdog" are confused with multiple other classes, showing the model has trouble distinguishing them.

The model does well for some foods but mixes up others, especially with "tacos" and "french_fries." It needs improvement for better accuracy.

Step 5.2 Build Basic CNN 2 (Improved CNN)¶

In [1370]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, GlobalAveragePooling2D, Dense, Dropout, BatchNormalization, Input
from tensorflow.keras.regularizers import l2
from tensorflow.keras.optimizers import Adam

basic_cnn_model_2 = Sequential([
    Input(shape=(128, 128, 3)),

    Conv2D(32, (3, 3), activation='relu', padding='same'),
    BatchNormalization(),
    MaxPooling2D((2, 2)),

    Conv2D(64, (3, 3), activation='relu', padding='same'),
    BatchNormalization(),
    MaxPooling2D((2, 2)),

    Conv2D(64, (3, 3), activation='relu', padding='same'),
    BatchNormalization(),
    MaxPooling2D((2, 2)),

    GlobalAveragePooling2D(),

    Dense(128, activation='relu', kernel_regularizer=l2(0.001)),
    Dropout(0.5),

    Dense(len(class_names), activation='softmax')
])

basic_cnn_model_2.compile(optimizer=Adam(learning_rate=1e-4),
                          loss='categorical_crossentropy',
                          metrics=['accuracy'])

basic_cnn_model_2.summary()
Model: "sequential_73"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ conv2d_278 (Conv2D)             │ (None, 128, 128, 32)   │           896 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization_176         │ (None, 128, 128, 32)   │           128 │
│ (BatchNormalization)            │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d_271               │ (None, 64, 64, 32)     │             0 │
│ (MaxPooling2D)                  │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_279 (Conv2D)             │ (None, 64, 64, 64)     │        18,496 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization_177         │ (None, 64, 64, 64)     │           256 │
│ (BatchNormalization)            │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d_272               │ (None, 32, 32, 64)     │             0 │
│ (MaxPooling2D)                  │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_280 (Conv2D)             │ (None, 32, 32, 64)     │        36,928 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization_178         │ (None, 32, 32, 64)     │           256 │
│ (BatchNormalization)            │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d_273               │ (None, 16, 16, 64)     │             0 │
│ (MaxPooling2D)                  │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ global_average_pooling2d_21     │ (None, 64)             │             0 │
│ (GlobalAveragePooling2D)        │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_147 (Dense)               │ (None, 128)            │         8,320 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_191 (Dropout)           │ (None, 128)            │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_148 (Dense)               │ (None, 10)             │         1,290 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 66,570 (260.04 KB)
 Trainable params: 66,250 (258.79 KB)
 Non-trainable params: 320 (1.25 KB)
In [ ]:
basic_cnn_model_2_history = train_model(basic_cnn_model_2, X_train, y_train, X_valid, y_valid, epochs=20, batch_size=16, filepath='basic_cnn_model_2_best.weights.h5')
Epoch 1/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 70ms/step - accuracy: 0.0916 - loss: 2.6087
Epoch 1: val_loss improved from inf to 2.74920, saving model to basic_cnn_model_2_best.weights.h5
26/26 ━━━━━━━━━━━━━━━━━━━━ 3s 77ms/step - accuracy: 0.0918 - loss: 2.6050 - val_accuracy: 0.1373 - val_loss: 2.7492 - learning_rate: 1.0000e-04
Epoch 2/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 69ms/step - accuracy: 0.0902 - loss: 2.5299
Epoch 2: val_loss improved from 2.74920 to 2.50387, saving model to basic_cnn_model_2_best.weights.h5
26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 72ms/step - accuracy: 0.0918 - loss: 2.5261 - val_accuracy: 0.1373 - val_loss: 2.5039 - learning_rate: 1.0000e-04
Epoch 3/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 75ms/step - accuracy: 0.1166 - loss: 2.4303
Epoch 3: val_loss improved from 2.50387 to 2.43262, saving model to basic_cnn_model_2_best.weights.h5
26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 77ms/step - accuracy: 0.1167 - loss: 2.4307 - val_accuracy: 0.0980 - val_loss: 2.4326 - learning_rate: 1.0000e-04
Epoch 4/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 69ms/step - accuracy: 0.1419 - loss: 2.4111
Epoch 4: val_loss improved from 2.43262 to 2.38982, saving model to basic_cnn_model_2_best.weights.h5
26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 71ms/step - accuracy: 0.1433 - loss: 2.4101 - val_accuracy: 0.1176 - val_loss: 2.3898 - learning_rate: 1.0000e-04
Epoch 5/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 69ms/step - accuracy: 0.1307 - loss: 2.4276
Epoch 5: val_loss improved from 2.38982 to 2.36039, saving model to basic_cnn_model_2_best.weights.h5
26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 71ms/step - accuracy: 0.1311 - loss: 2.4269 - val_accuracy: 0.1176 - val_loss: 2.3604 - learning_rate: 1.0000e-04
Epoch 6/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 69ms/step - accuracy: 0.1826 - loss: 2.2419
Epoch 6: val_loss improved from 2.36039 to 2.35024, saving model to basic_cnn_model_2_best.weights.h5
26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 71ms/step - accuracy: 0.1820 - loss: 2.2447 - val_accuracy: 0.1569 - val_loss: 2.3502 - learning_rate: 1.0000e-04
Epoch 7/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 83ms/step - accuracy: 0.1655 - loss: 2.2877
Epoch 7: val_loss improved from 2.35024 to 2.33138, saving model to basic_cnn_model_2_best.weights.h5
26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 86ms/step - accuracy: 0.1645 - loss: 2.2910 - val_accuracy: 0.2549 - val_loss: 2.3314 - learning_rate: 1.0000e-04
Epoch 8/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 74ms/step - accuracy: 0.2436 - loss: 2.3232
Epoch 8: val_loss improved from 2.33138 to 2.29568, saving model to basic_cnn_model_2_best.weights.h5
26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 77ms/step - accuracy: 0.2427 - loss: 2.3215 - val_accuracy: 0.2157 - val_loss: 2.2957 - learning_rate: 1.0000e-04
Epoch 9/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 71ms/step - accuracy: 0.2186 - loss: 2.2519
Epoch 9: val_loss improved from 2.29568 to 2.27044, saving model to basic_cnn_model_2_best.weights.h5
26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 74ms/step - accuracy: 0.2171 - loss: 2.2536 - val_accuracy: 0.2549 - val_loss: 2.2704 - learning_rate: 1.0000e-04
Epoch 10/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 69ms/step - accuracy: 0.2218 - loss: 2.1913
Epoch 10: val_loss improved from 2.27044 to 2.26747, saving model to basic_cnn_model_2_best.weights.h5
26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 71ms/step - accuracy: 0.2209 - loss: 2.1928 - val_accuracy: 0.2549 - val_loss: 2.2675 - learning_rate: 1.0000e-04
Epoch 11/20
26/26 ━━━━━━━━━━━━━━━━━━━━ 0s 70ms/step - accuracy: 0.2001 - loss: 2.2228
Epoch 11: val_loss improved from 2.26747 to 2.24480, saving model to basic_cnn_model_2_best.weights.h5
26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 74ms/step - accuracy: 0.2007 - loss: 2.2219 - val_accuracy: 0.2745 - val_loss: 2.2448 - learning_rate: 1.0000e-04
Epoch 12/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 77ms/step - accuracy: 0.2629 - loss: 2.1499
Epoch 12: val_loss improved from 2.24480 to 2.22419, saving model to basic_cnn_model_2_best.weights.h5
26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 79ms/step - accuracy: 0.2630 - loss: 2.1508 - val_accuracy: 0.2353 - val_loss: 2.2242 - learning_rate: 1.0000e-04
Epoch 13/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 70ms/step - accuracy: 0.2225 - loss: 2.2148
Epoch 13: val_loss improved from 2.22419 to 2.21153, saving model to basic_cnn_model_2_best.weights.h5
26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 72ms/step - accuracy: 0.2237 - loss: 2.2109 - val_accuracy: 0.2549 - val_loss: 2.2115 - learning_rate: 1.0000e-04
Epoch 14/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 70ms/step - accuracy: 0.2792 - loss: 2.1106
Epoch 14: val_loss improved from 2.21153 to 2.20730, saving model to basic_cnn_model_2_best.weights.h5
26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 72ms/step - accuracy: 0.2779 - loss: 2.1116 - val_accuracy: 0.2745 - val_loss: 2.2073 - learning_rate: 1.0000e-04
Epoch 15/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 70ms/step - accuracy: 0.2980 - loss: 2.1514
Epoch 15: val_loss improved from 2.20730 to 2.19979, saving model to basic_cnn_model_2_best.weights.h5
26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 73ms/step - accuracy: 0.2969 - loss: 2.1514 - val_accuracy: 0.2157 - val_loss: 2.1998 - learning_rate: 1.0000e-04
Epoch 16/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 71ms/step - accuracy: 0.2842 - loss: 2.1513
Epoch 16: val_loss improved from 2.19979 to 2.19116, saving model to basic_cnn_model_2_best.weights.h5
26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 73ms/step - accuracy: 0.2843 - loss: 2.1491 - val_accuracy: 0.2157 - val_loss: 2.1912 - learning_rate: 1.0000e-04
Epoch 17/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 69ms/step - accuracy: 0.2821 - loss: 2.0929
Epoch 17: val_loss improved from 2.19116 to 2.19076, saving model to basic_cnn_model_2_best.weights.h5
26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 72ms/step - accuracy: 0.2809 - loss: 2.0935 - val_accuracy: 0.2157 - val_loss: 2.1908 - learning_rate: 1.0000e-04
Epoch 18/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 70ms/step - accuracy: 0.2972 - loss: 2.1153
Epoch 18: val_loss improved from 2.19076 to 2.15424, saving model to basic_cnn_model_2_best.weights.h5
26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 72ms/step - accuracy: 0.2982 - loss: 2.1121 - val_accuracy: 0.2549 - val_loss: 2.1542 - learning_rate: 1.0000e-04
Epoch 19/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 70ms/step - accuracy: 0.2608 - loss: 2.0674
Epoch 19: val_loss improved from 2.15424 to 2.15281, saving model to basic_cnn_model_2_best.weights.h5
26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 72ms/step - accuracy: 0.2640 - loss: 2.0656 - val_accuracy: 0.2549 - val_loss: 2.1528 - learning_rate: 1.0000e-04
Epoch 20/20
25/26 ━━━━━━━━━━━━━━━━━━━━ 0s 69ms/step - accuracy: 0.2518 - loss: 2.1404
Epoch 20: val_loss did not improve from 2.15281
26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 71ms/step - accuracy: 0.2541 - loss: 2.1379 - val_accuracy: 0.2745 - val_loss: 2.1570 - learning_rate: 1.0000e-04
In [1434]:
plot_training_history(basic_cnn_model_2_history, basic_cnn_model_2, X_test, y_test, model_name="Basic CNN 2")
No description has been provided for this image
🔍 Final Epoch Metrics:
📈 Training Accuracy     : 0.28
📉 Training Loss         : 2.11
📈 Validation Accuracy   : 0.27
📉 Validation Loss       : 2.1570

🧪 Test Accuracy         : 0.29
🧪 Test Loss             : 2.12
In [1379]:
# Evaluate the trained classification model on the test set
# ----------------------------------------------------------
# basic_cnn_model_1     : The trained Keras model that will be evaluated
# X_test                : Test feature data (e.g., images) for model prediction
# y_test                : True labels (can be one-hot encoded or class indices) for evaluating predictions
# y_train               : Optional — training labels used to fit LabelEncoder on all classes (helps preserve class label mapping)

evaluate_classification_model(basic_cnn_model_2, X_test, y_test, y_train=y_train)
2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 35ms/step
Classification Report:
              precision    recall  f1-score   support

   Apple Pie       0.50      0.20      0.29         5
   Chocolate       0.30      0.60      0.40         5
French Fries       0.00      0.00      0.00         5
      Hotdog       0.33      0.20      0.25         5
      Nachos       0.20      0.20      0.20         5
       Pizza       0.40      0.40      0.40         5
 onion_rings       0.25      0.40      0.31         5
    pancakes       0.67      0.33      0.44         6
spring_rolls       0.23      0.50      0.32         6
       tacos       0.00      0.00      0.00         5

    accuracy                           0.29        52
   macro avg       0.29      0.28      0.26        52
weighted avg       0.29      0.29      0.26        52

No description has been provided for this image
In [1381]:
test_loss, test_acc = basic_cnn_model_2.evaluate(X_test, y_test)
print(f"Test Accuracy: {test_acc * 100:.2f}%")
2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 37ms/step - accuracy: 0.2965 - loss: 2.1394
Test Accuracy: 28.85%
In [1387]:
# Visualize predictions on random test images
# Arguments:
# - X_test              : array of test images (preprocessed, shape like (N, H, W, C))
# - y_test              : one-hot encoded true labels for test images
# - class_names         : list of class label names corresponding to indices
# - basic_cnn_model_1   : trained classification model
# - num_samples         : number of random samples to display (default is 5)

plot_random_predictions(X_test, y_test, class_names, basic_cnn_model_1, num_samples=20)
No description has been provided for this image

Basic CNN Model 3 With Data Augmentation¶

In [1615]:
import pandas as pd

# Create a copy of the original annotations DataFrame to avoid modifying it directly
# Group by 'filename' to aggregate all rows for each unique image filename
# For each group, take the first row (useful if there are multiple annotations per image)
# Reset the index so the result is a clean DataFrame with default integer indexing
annotations_df = food_annotations_df.copy().groupby('filename').first().reset_index()
In [1616]:
# Display or inspect the processed annotations DataFrame
# This DataFrame contains one row per unique filename
# Each row corresponds to the first annotation found for that image file
annotations_df
Out[1616]:
filename width height class xmin ymin xmax ymax label
0 0301-hotdog_jpg.rf.d9d8524fb7b25b7e1549de49ab6... 640 640 Hotdog 137 21 619 616 3
1 0909-hotdog_jpg.rf.92cd083aea53111f94e0b935d6a... 640 640 Hotdog 151 182 543 599 3
2 100148_jpg.rf.2206fe7bfbb84220314498f35fae6bbb... 512 384 French Fries 126 61 500 367 2
3 100284-nachos_jpg.rf.d8125a679ab339aaad0c57e48... 640 640 Nachos 0 143 558 597 4
4 101450_jpg.rf.eddcc68593aa541ba3d9cce8835094be... 512 512 pancakes 22 179 229 306 7
... ... ... ... ... ... ... ... ... ...
508 99074-nachos_jpg.rf.a0d38f7404682ad9fe4c2eb54a... 640 640 Nachos 0 24 640 604 4
509 99076-nachos_jpg.rf.072c2c0791659ef3c42d13d950... 640 640 Nachos 0 33 640 596 4
510 99087-nachos_jpg.rf.6996cfcf63d57f957a64256cec... 640 640 Nachos 2 48 640 640 4
511 99088-nachos_jpg.rf.75fbb62b1beeceecb85275f7dd... 640 640 Nachos 0 101 640 605 4
512 9949_jpg.rf.79c214a1830051934e6cfe1f557c6f86.jpg 512 512 pancakes 33 182 486 458 7

513 rows × 9 columns

In [1617]:
from sklearn.model_selection import train_test_split

train_df, temp_df = train_test_split(annotations_df, test_size=0.2, stratify=annotations_df['class'], random_state=42)
val_df, test_df = train_test_split(temp_df, test_size=0.5, stratify=temp_df['class'], random_state=42)
In [1618]:
print(train_df.shape)
print(test_df.shape)
print(val_df.shape)
(410, 9)
(52, 9)
(51, 9)
In [1619]:
# Check Label Distribution
train_df['class'].value_counts().plot(kind='bar')
Out[1619]:
<Axes: xlabel='class'>
No description has been provided for this image

Check Image Files Exist and Are Correctly Referenced

  • Before using flow_from_dataframe(), you should verify that all image filenames in your dataframe actually exist in your image folde
In [1620]:
import os

# Function to check if image files exist
def check_image_files(df, img_dir):
    missing_files = []
    for fname in df['filename']:
        if not os.path.isfile(os.path.join(img_dir, fname)):
            missing_files.append(fname)
    return missing_files

# Check train, val, test dataframes
missing_train = check_image_files(train_df, img_folder)
missing_val = check_image_files(val_df, img_folder)
missing_test = check_image_files(test_df, img_folder)

print(f"Missing files in training data: {len(missing_train)}")
print(f"Missing files in validation data: {len(missing_val)}")
print(f"Missing files in test data: {len(missing_test)}")

if missing_train:
    print("Some missing training images:", missing_train[:5])
if missing_val:
    print("Some missing validation images:", missing_val[:5])
if missing_test:
    print("Some missing test images:", missing_test[:5])
Missing files in training data: 0
Missing files in validation data: 0
Missing files in test data: 0
In [1621]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

img_folder = 'Datasetv1/original_images/'

train_datagen = ImageDataGenerator(
    rescale=1./255,
    horizontal_flip=True,
    vertical_flip=True,
    rotation_range=20,
    zoom_range=0.2,
    shear_range=0.1,
    width_shift_range=0.1,      # added width shift
    height_shift_range=0.1,     # added height shift
    fill_mode='nearest'         # better filling of pixels after transformations
)
 
val_test_datagen = ImageDataGenerator(rescale=1./255)

# Flow from DataFrame
train_generator = train_datagen.flow_from_dataframe(
    dataframe=train_df,
    directory=img_folder,
    x_col='filename',
    y_col='class',
    target_size=(128, 128),
    class_mode='categorical',
    batch_size=32,
    shuffle=True
)

val_generator = val_test_datagen.flow_from_dataframe(
    dataframe=val_df,
    directory=img_folder,
    x_col='filename',
    y_col='class',
    target_size=(128, 128),
    class_mode='categorical',
    batch_size=32,
    shuffle=False
)

test_generator = val_test_datagen.flow_from_dataframe(
    dataframe=test_df,
    directory=img_folder,
    x_col='filename',
    y_col='class',
    target_size=(128, 128),
    class_mode='categorical',
    batch_size=32,
    shuffle=False
)
Found 410 validated image filenames belonging to 10 classes.
Found 51 validated image filenames belonging to 10 classes.
Found 52 validated image filenames belonging to 10 classes.
In [1622]:
print(train_generator.class_indices)
print(test_generator.class_indices)
print(val_generator.class_indices)
{'Apple Pie': 0, 'Chocolate': 1, 'French Fries': 2, 'Hotdog': 3, 'Nachos': 4, 'Pizza': 5, 'onion_rings': 6, 'pancakes': 7, 'spring_rolls': 8, 'tacos': 9}
{'Apple Pie': 0, 'Chocolate': 1, 'French Fries': 2, 'Hotdog': 3, 'Nachos': 4, 'Pizza': 5, 'onion_rings': 6, 'pancakes': 7, 'spring_rolls': 8, 'tacos': 9}
{'Apple Pie': 0, 'Chocolate': 1, 'French Fries': 2, 'Hotdog': 3, 'Nachos': 4, 'Pizza': 5, 'onion_rings': 6, 'pancakes': 7, 'spring_rolls': 8, 'tacos': 9}
In [1623]:
images, labels = next(train_generator)
print("Labels shape:", labels.shape)
print("Example label[0]:", labels[0])
Labels shape: (32, 10)
Example label[0]: [0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]
In [1624]:
# Get mapping from class index to label name
class_indices = train_generator.class_indices
inv_class_indices = {v: k for k, v in class_indices.items()}
num_images = 5
plt.figure(figsize=(15, 5))
for i in range(num_images):
    ax = plt.subplot(1, num_images, i + 1)
    plt.imshow(images[i])
    class_index = np.argmax(labels[i])
    class_label = inv_class_indices[class_index]
    plt.title(f"{class_label}")
    plt.axis("off")
plt.tight_layout()
plt.show()
No description has been provided for this image
In [1625]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, GlobalAveragePooling2D, Dense, Dropout, BatchNormalization, Input
from tensorflow.keras.regularizers import l2
from tensorflow.keras.optimizers import Adam

basic_cnn_model_3 = Sequential([
    Input(shape=(128, 128, 3)),

    Conv2D(32, (3, 3), activation='relu', padding='same'),
    MaxPooling2D((2, 2)),
    
    Conv2D(64, (3, 3), activation='relu', padding='same'),
    MaxPooling2D((2, 2)),

    Conv2D(64, (3, 3), activation='relu', padding='same'),
    MaxPooling2D((2, 2)),

    Conv2D(128, (3, 3), activation='relu', padding='same'),
    MaxPooling2D((2, 2)),

    GlobalAveragePooling2D(),  # ✅ Fix: Reduce 3D to 1D

    Dense(128, activation='relu', kernel_regularizer=l2(0.001)),
    Dropout(0.5),

    Dense(len(class_names), activation='softmax')
])


basic_cnn_model_3.compile(optimizer=Adam(learning_rate=1e-4),
                          loss='categorical_crossentropy',
                          metrics=['accuracy'])

basic_cnn_model_2.summary()
Model: "sequential_73"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ conv2d_278 (Conv2D)             │ (None, 128, 128, 32)   │           896 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization_176         │ (None, 128, 128, 32)   │           128 │
│ (BatchNormalization)            │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d_271               │ (None, 64, 64, 32)     │             0 │
│ (MaxPooling2D)                  │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_279 (Conv2D)             │ (None, 64, 64, 64)     │        18,496 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization_177         │ (None, 64, 64, 64)     │           256 │
│ (BatchNormalization)            │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d_272               │ (None, 32, 32, 64)     │             0 │
│ (MaxPooling2D)                  │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_280 (Conv2D)             │ (None, 32, 32, 64)     │        36,928 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization_178         │ (None, 32, 32, 64)     │           256 │
│ (BatchNormalization)            │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d_273               │ (None, 16, 16, 64)     │             0 │
│ (MaxPooling2D)                  │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ global_average_pooling2d_21     │ (None, 64)             │             0 │
│ (GlobalAveragePooling2D)        │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_147 (Dense)               │ (None, 128)            │         8,320 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_191 (Dropout)           │ (None, 128)            │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_148 (Dense)               │ (None, 10)             │         1,290 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 199,072 (777.63 KB)
 Trainable params: 66,250 (258.79 KB)
 Non-trainable params: 320 (1.25 KB)
 Optimizer params: 132,502 (517.59 KB)
In [1626]:
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau, ModelCheckpoint

# Callbacks for better training control
callbacks = [
    EarlyStopping(monitor='val_loss', min_delta=0.01, patience=20, verbose=1, mode='auto'),
    ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=10, verbose=1, mode='auto'),
    ModelCheckpoint("basic_cnn_model_3_augmented.weights.h5", save_best_only=True, save_weights_only=True ,verbose=1)
]
# Train the model
basic_cnn_model_3_history = basic_cnn_model_3.fit(
    train_generator,
    validation_data=val_generator,
    epochs=50,
    callbacks=callbacks
)
Epoch 1/50
/Users/shashik/projects/greatlearning/AIML-Computer-Vision-Food-101-Detection/.venvfood101/lib/python3.10/site-packages/keras/src/trainers/data_adapters/py_dataset_adapter.py:121: UserWarning: Your `PyDataset` class should call `super().__init__(**kwargs)` in its constructor. `**kwargs` can include `workers`, `use_multiprocessing`, `max_queue_size`. Do not pass these arguments to `fit()`, as they will be ignored.
  self._warn_if_super_not_called()
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 104ms/step - accuracy: 0.0813 - loss: 2.4357
Epoch 1: val_loss improved from inf to 2.43156, saving model to basic_cnn_model_3_augmented.weights.h5
13/13 ━━━━━━━━━━━━━━━━━━━━ 2s 124ms/step - accuracy: 0.0826 - loss: 2.4358 - val_accuracy: 0.0980 - val_loss: 2.4316 - learning_rate: 1.0000e-04
Epoch 2/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 99ms/step - accuracy: 0.0868 - loss: 2.4333
Epoch 2: val_loss improved from 2.43156 to 2.42882, saving model to basic_cnn_model_3_augmented.weights.h5
13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 107ms/step - accuracy: 0.0865 - loss: 2.4332 - val_accuracy: 0.0980 - val_loss: 2.4288 - learning_rate: 1.0000e-04
Epoch 3/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 98ms/step - accuracy: 0.1229 - loss: 2.4247
Epoch 3: val_loss improved from 2.42882 to 2.42639, saving model to basic_cnn_model_3_augmented.weights.h5
13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 106ms/step - accuracy: 0.1220 - loss: 2.4247 - val_accuracy: 0.0980 - val_loss: 2.4264 - learning_rate: 1.0000e-04
Epoch 4/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 103ms/step - accuracy: 0.0879 - loss: 2.4245
Epoch 4: val_loss improved from 2.42639 to 2.42318, saving model to basic_cnn_model_3_augmented.weights.h5
13/13 ━━━━━━━━━━━━━━━━━━━━ 2s 112ms/step - accuracy: 0.0898 - loss: 2.4244 - val_accuracy: 0.0980 - val_loss: 2.4232 - learning_rate: 1.0000e-04
Epoch 5/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 109ms/step - accuracy: 0.1191 - loss: 2.4227
Epoch 5: val_loss improved from 2.42318 to 2.42027, saving model to basic_cnn_model_3_augmented.weights.h5
13/13 ━━━━━━━━━━━━━━━━━━━━ 2s 117ms/step - accuracy: 0.1186 - loss: 2.4225 - val_accuracy: 0.0784 - val_loss: 2.4203 - learning_rate: 1.0000e-04
Epoch 6/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 105ms/step - accuracy: 0.1185 - loss: 2.4208
Epoch 6: val_loss improved from 2.42027 to 2.41766, saving model to basic_cnn_model_3_augmented.weights.h5
13/13 ━━━━━━━━━━━━━━━━━━━━ 2s 114ms/step - accuracy: 0.1182 - loss: 2.4208 - val_accuracy: 0.1176 - val_loss: 2.4177 - learning_rate: 1.0000e-04
Epoch 7/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 99ms/step - accuracy: 0.0991 - loss: 2.4196 
Epoch 7: val_loss improved from 2.41766 to 2.41515, saving model to basic_cnn_model_3_augmented.weights.h5
13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 108ms/step - accuracy: 0.0995 - loss: 2.4194 - val_accuracy: 0.1373 - val_loss: 2.4152 - learning_rate: 1.0000e-04
Epoch 8/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 114ms/step - accuracy: 0.1366 - loss: 2.4131
Epoch 8: val_loss improved from 2.41515 to 2.41279, saving model to basic_cnn_model_3_augmented.weights.h5
13/13 ━━━━━━━━━━━━━━━━━━━━ 2s 122ms/step - accuracy: 0.1350 - loss: 2.4132 - val_accuracy: 0.1569 - val_loss: 2.4128 - learning_rate: 1.0000e-04
Epoch 9/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 102ms/step - accuracy: 0.1246 - loss: 2.4108
Epoch 9: val_loss improved from 2.41279 to 2.40999, saving model to basic_cnn_model_3_augmented.weights.h5
13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 109ms/step - accuracy: 0.1253 - loss: 2.4107 - val_accuracy: 0.1765 - val_loss: 2.4100 - learning_rate: 1.0000e-04
Epoch 10/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 112ms/step - accuracy: 0.1081 - loss: 2.4053
Epoch 10: val_loss improved from 2.40999 to 2.40816, saving model to basic_cnn_model_3_augmented.weights.h5
13/13 ━━━━━━━━━━━━━━━━━━━━ 2s 122ms/step - accuracy: 0.1082 - loss: 2.4052 - val_accuracy: 0.1569 - val_loss: 2.4082 - learning_rate: 1.0000e-04
Epoch 11/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 105ms/step - accuracy: 0.0783 - loss: 2.4051
Epoch 11: val_loss improved from 2.40816 to 2.40503, saving model to basic_cnn_model_3_augmented.weights.h5
13/13 ━━━━━━━━━━━━━━━━━━━━ 2s 113ms/step - accuracy: 0.0798 - loss: 2.4047 - val_accuracy: 0.0980 - val_loss: 2.4050 - learning_rate: 1.0000e-04
Epoch 12/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 113ms/step - accuracy: 0.1346 - loss: 2.3998
Epoch 12: val_loss improved from 2.40503 to 2.40480, saving model to basic_cnn_model_3_augmented.weights.h5
13/13 ━━━━━━━━━━━━━━━━━━━━ 2s 121ms/step - accuracy: 0.1343 - loss: 2.3995 - val_accuracy: 0.1176 - val_loss: 2.4048 - learning_rate: 1.0000e-04
Epoch 13/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 99ms/step - accuracy: 0.0947 - loss: 2.3915
Epoch 13: val_loss improved from 2.40480 to 2.40207, saving model to basic_cnn_model_3_augmented.weights.h5
13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 107ms/step - accuracy: 0.0951 - loss: 2.3912 - val_accuracy: 0.1373 - val_loss: 2.4021 - learning_rate: 1.0000e-04
Epoch 14/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 101ms/step - accuracy: 0.1866 - loss: 2.3851
Epoch 14: val_loss improved from 2.40207 to 2.40019, saving model to basic_cnn_model_3_augmented.weights.h5
13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 108ms/step - accuracy: 0.1831 - loss: 2.3851 - val_accuracy: 0.1176 - val_loss: 2.4002 - learning_rate: 1.0000e-04
Epoch 15/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 99ms/step - accuracy: 0.1689 - loss: 2.3773
Epoch 15: val_loss improved from 2.40019 to 2.39724, saving model to basic_cnn_model_3_augmented.weights.h5
13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 106ms/step - accuracy: 0.1680 - loss: 2.3769 - val_accuracy: 0.1373 - val_loss: 2.3972 - learning_rate: 1.0000e-04
Epoch 16/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 100ms/step - accuracy: 0.1245 - loss: 2.3802
Epoch 16: val_loss improved from 2.39724 to 2.39545, saving model to basic_cnn_model_3_augmented.weights.h5
13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 109ms/step - accuracy: 0.1250 - loss: 2.3801 - val_accuracy: 0.1961 - val_loss: 2.3955 - learning_rate: 1.0000e-04
Epoch 17/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 100ms/step - accuracy: 0.1297 - loss: 2.3773
Epoch 17: val_loss improved from 2.39545 to 2.39436, saving model to basic_cnn_model_3_augmented.weights.h5
13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 108ms/step - accuracy: 0.1309 - loss: 2.3765 - val_accuracy: 0.1961 - val_loss: 2.3944 - learning_rate: 1.0000e-04
Epoch 18/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 99ms/step - accuracy: 0.1346 - loss: 2.3585 
Epoch 18: val_loss did not improve from 2.39436
13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 106ms/step - accuracy: 0.1351 - loss: 2.3588 - val_accuracy: 0.1961 - val_loss: 2.3958 - learning_rate: 1.0000e-04
Epoch 19/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 100ms/step - accuracy: 0.1582 - loss: 2.3586
Epoch 19: val_loss did not improve from 2.39436
13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 108ms/step - accuracy: 0.1586 - loss: 2.3583 - val_accuracy: 0.1765 - val_loss: 2.3960 - learning_rate: 1.0000e-04
Epoch 20/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 99ms/step - accuracy: 0.1883 - loss: 2.3370
Epoch 20: val_loss did not improve from 2.39436
13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 105ms/step - accuracy: 0.1870 - loss: 2.3386 - val_accuracy: 0.2157 - val_loss: 2.3977 - learning_rate: 1.0000e-04
Epoch 21/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 99ms/step - accuracy: 0.1457 - loss: 2.3335 
Epoch 21: val_loss improved from 2.39436 to 2.39341, saving model to basic_cnn_model_3_augmented.weights.h5
13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 108ms/step - accuracy: 0.1471 - loss: 2.3339 - val_accuracy: 0.1569 - val_loss: 2.3934 - learning_rate: 1.0000e-04
Epoch 22/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 146ms/step - accuracy: 0.1565 - loss: 2.3496
Epoch 22: val_loss did not improve from 2.39341
13/13 ━━━━━━━━━━━━━━━━━━━━ 2s 157ms/step - accuracy: 0.1559 - loss: 2.3493 - val_accuracy: 0.1569 - val_loss: 2.3961 - learning_rate: 1.0000e-04
Epoch 23/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 106ms/step - accuracy: 0.1441 - loss: 2.3391
Epoch 23: val_loss improved from 2.39341 to 2.39192, saving model to basic_cnn_model_3_augmented.weights.h5
13/13 ━━━━━━━━━━━━━━━━━━━━ 2s 115ms/step - accuracy: 0.1438 - loss: 2.3402 - val_accuracy: 0.1176 - val_loss: 2.3919 - learning_rate: 1.0000e-04
Epoch 24/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 100ms/step - accuracy: 0.1752 - loss: 2.3543
Epoch 24: val_loss did not improve from 2.39192
13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 107ms/step - accuracy: 0.1740 - loss: 2.3536 - val_accuracy: 0.1373 - val_loss: 2.3964 - learning_rate: 1.0000e-04
Epoch 25/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 100ms/step - accuracy: 0.1988 - loss: 2.3224
Epoch 25: val_loss did not improve from 2.39192
13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 108ms/step - accuracy: 0.1978 - loss: 2.3230 - val_accuracy: 0.1373 - val_loss: 2.3958 - learning_rate: 1.0000e-04
Epoch 26/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 99ms/step - accuracy: 0.1210 - loss: 2.3666 
Epoch 26: val_loss did not improve from 2.39192
13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 106ms/step - accuracy: 0.1211 - loss: 2.3657 - val_accuracy: 0.1569 - val_loss: 2.3922 - learning_rate: 1.0000e-04
Epoch 27/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 100ms/step - accuracy: 0.2076 - loss: 2.3352
Epoch 27: val_loss did not improve from 2.39192
13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 107ms/step - accuracy: 0.2060 - loss: 2.3349 - val_accuracy: 0.1765 - val_loss: 2.3939 - learning_rate: 1.0000e-04
Epoch 28/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 105ms/step - accuracy: 0.1501 - loss: 2.3447
Epoch 28: val_loss improved from 2.39192 to 2.39116, saving model to basic_cnn_model_3_augmented.weights.h5
13/13 ━━━━━━━━━━━━━━━━━━━━ 2s 114ms/step - accuracy: 0.1523 - loss: 2.3430 - val_accuracy: 0.2157 - val_loss: 2.3912 - learning_rate: 1.0000e-04
Epoch 29/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 120ms/step - accuracy: 0.1856 - loss: 2.3212
Epoch 29: val_loss did not improve from 2.39116
13/13 ━━━━━━━━━━━━━━━━━━━━ 2s 127ms/step - accuracy: 0.1872 - loss: 2.3201 - val_accuracy: 0.1373 - val_loss: 2.3913 - learning_rate: 1.0000e-04
Epoch 30/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 145ms/step - accuracy: 0.1809 - loss: 2.3547
Epoch 30: val_loss did not improve from 2.39116
13/13 ━━━━━━━━━━━━━━━━━━━━ 2s 152ms/step - accuracy: 0.1807 - loss: 2.3533 - val_accuracy: 0.2157 - val_loss: 2.3924 - learning_rate: 1.0000e-04
Epoch 31/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 132ms/step - accuracy: 0.1870 - loss: 2.3019
Epoch 31: val_loss improved from 2.39116 to 2.38810, saving model to basic_cnn_model_3_augmented.weights.h5
13/13 ━━━━━━━━━━━━━━━━━━━━ 2s 143ms/step - accuracy: 0.1865 - loss: 2.3018 - val_accuracy: 0.0980 - val_loss: 2.3881 - learning_rate: 1.0000e-04
Epoch 32/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 132ms/step - accuracy: 0.1851 - loss: 2.2787
Epoch 32: val_loss did not improve from 2.38810
13/13 ━━━━━━━━━━━━━━━━━━━━ 2s 139ms/step - accuracy: 0.1841 - loss: 2.2803 - val_accuracy: 0.1961 - val_loss: 2.3949 - learning_rate: 1.0000e-04
Epoch 33/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 115ms/step - accuracy: 0.1771 - loss: 2.2976
Epoch 33: val_loss did not improve from 2.38810
13/13 ━━━━━━━━━━━━━━━━━━━━ 2s 123ms/step - accuracy: 0.1755 - loss: 2.2986 - val_accuracy: 0.1569 - val_loss: 2.3931 - learning_rate: 1.0000e-04
Epoch 34/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 105ms/step - accuracy: 0.1700 - loss: 2.3473
Epoch 34: val_loss improved from 2.38810 to 2.38520, saving model to basic_cnn_model_3_augmented.weights.h5
13/13 ━━━━━━━━━━━━━━━━━━━━ 2s 115ms/step - accuracy: 0.1694 - loss: 2.3458 - val_accuracy: 0.1176 - val_loss: 2.3852 - learning_rate: 1.0000e-04
Epoch 35/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 119ms/step - accuracy: 0.1659 - loss: 2.2985
Epoch 35: val_loss did not improve from 2.38520
13/13 ━━━━━━━━━━━━━━━━━━━━ 2s 126ms/step - accuracy: 0.1664 - loss: 2.3000 - val_accuracy: 0.1765 - val_loss: 2.3889 - learning_rate: 1.0000e-04
Epoch 36/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 107ms/step - accuracy: 0.1671 - loss: 2.3073
Epoch 36: val_loss improved from 2.38520 to 2.37912, saving model to basic_cnn_model_3_augmented.weights.h5
13/13 ━━━━━━━━━━━━━━━━━━━━ 2s 116ms/step - accuracy: 0.1682 - loss: 2.3059 - val_accuracy: 0.1765 - val_loss: 2.3791 - learning_rate: 1.0000e-04
Epoch 37/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 102ms/step - accuracy: 0.1918 - loss: 2.2898
Epoch 37: val_loss did not improve from 2.37912
13/13 ━━━━━━━━━━━━━━━━━━━━ 2s 110ms/step - accuracy: 0.1910 - loss: 2.2897 - val_accuracy: 0.1765 - val_loss: 2.3797 - learning_rate: 1.0000e-04
Epoch 38/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 101ms/step - accuracy: 0.1756 - loss: 2.3046
Epoch 38: val_loss did not improve from 2.37912
13/13 ━━━━━━━━━━━━━━━━━━━━ 2s 108ms/step - accuracy: 0.1775 - loss: 2.3036 - val_accuracy: 0.1961 - val_loss: 2.3804 - learning_rate: 1.0000e-04
Epoch 39/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 101ms/step - accuracy: 0.1921 - loss: 2.2727
Epoch 39: val_loss did not improve from 2.37912
13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 108ms/step - accuracy: 0.1922 - loss: 2.2729 - val_accuracy: 0.1373 - val_loss: 2.3799 - learning_rate: 1.0000e-04
Epoch 40/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 109ms/step - accuracy: 0.1646 - loss: 2.2872
Epoch 40: val_loss improved from 2.37912 to 2.37143, saving model to basic_cnn_model_3_augmented.weights.h5
13/13 ━━━━━━━━━━━━━━━━━━━━ 2s 118ms/step - accuracy: 0.1648 - loss: 2.2880 - val_accuracy: 0.1765 - val_loss: 2.3714 - learning_rate: 1.0000e-04
Epoch 41/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 106ms/step - accuracy: 0.2073 - loss: 2.2880
Epoch 41: val_loss improved from 2.37143 to 2.37129, saving model to basic_cnn_model_3_augmented.weights.h5
13/13 ━━━━━━━━━━━━━━━━━━━━ 2s 113ms/step - accuracy: 0.2074 - loss: 2.2872 - val_accuracy: 0.2157 - val_loss: 2.3713 - learning_rate: 1.0000e-04
Epoch 42/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 103ms/step - accuracy: 0.1858 - loss: 2.2772
Epoch 42: val_loss did not improve from 2.37129
13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 110ms/step - accuracy: 0.1861 - loss: 2.2772 - val_accuracy: 0.1961 - val_loss: 2.3717 - learning_rate: 1.0000e-04
Epoch 43/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 119ms/step - accuracy: 0.2006 - loss: 2.2632
Epoch 43: val_loss improved from 2.37129 to 2.36643, saving model to basic_cnn_model_3_augmented.weights.h5
13/13 ━━━━━━━━━━━━━━━━━━━━ 2s 128ms/step - accuracy: 0.2006 - loss: 2.2628 - val_accuracy: 0.1961 - val_loss: 2.3664 - learning_rate: 1.0000e-04
Epoch 44/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 102ms/step - accuracy: 0.2224 - loss: 2.2806
Epoch 44: val_loss did not improve from 2.36643
13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 109ms/step - accuracy: 0.2219 - loss: 2.2805 - val_accuracy: 0.1765 - val_loss: 2.3872 - learning_rate: 1.0000e-04
Epoch 45/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 102ms/step - accuracy: 0.1432 - loss: 2.3116
Epoch 45: val_loss improved from 2.36643 to 2.35331, saving model to basic_cnn_model_3_augmented.weights.h5
13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 110ms/step - accuracy: 0.1457 - loss: 2.3092 - val_accuracy: 0.1569 - val_loss: 2.3533 - learning_rate: 1.0000e-04
Epoch 46/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 100ms/step - accuracy: 0.2155 - loss: 2.2864
Epoch 46: val_loss improved from 2.35331 to 2.34631, saving model to basic_cnn_model_3_augmented.weights.h5
13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 109ms/step - accuracy: 0.2150 - loss: 2.2859 - val_accuracy: 0.2157 - val_loss: 2.3463 - learning_rate: 1.0000e-04
Epoch 47/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 109ms/step - accuracy: 0.2252 - loss: 2.2581
Epoch 47: val_loss did not improve from 2.34631
13/13 ━━━━━━━━━━━━━━━━━━━━ 2s 116ms/step - accuracy: 0.2234 - loss: 2.2585 - val_accuracy: 0.2353 - val_loss: 2.3540 - learning_rate: 1.0000e-04
Epoch 48/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 107ms/step - accuracy: 0.2225 - loss: 2.2332
Epoch 48: val_loss did not improve from 2.34631
13/13 ━━━━━━━━━━━━━━━━━━━━ 2s 114ms/step - accuracy: 0.2216 - loss: 2.2342 - val_accuracy: 0.2157 - val_loss: 2.3660 - learning_rate: 1.0000e-04
Epoch 49/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 101ms/step - accuracy: 0.1731 - loss: 2.2515
Epoch 49: val_loss did not improve from 2.34631
13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 109ms/step - accuracy: 0.1741 - loss: 2.2514 - val_accuracy: 0.2157 - val_loss: 2.3574 - learning_rate: 1.0000e-04
Epoch 50/50
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 101ms/step - accuracy: 0.2409 - loss: 2.2055
Epoch 50: val_loss did not improve from 2.34631
13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 108ms/step - accuracy: 0.2392 - loss: 2.2075 - val_accuracy: 0.2549 - val_loss: 2.3521 - learning_rate: 1.0000e-04
In [1583]:
def plot_training_history_generator(history, model, test_generator, model_name="Model"):
    import matplotlib.pyplot as plt

    # Plot
    plt.figure(figsize=(12, 5))

    # Accuracy
    plt.subplot(1, 2, 1)
    plt.plot(history.history['accuracy'], label='Train Accuracy')
    plt.plot(history.history['val_accuracy'], label='Val Accuracy')
    plt.title(f'{model_name} - Accuracy')
    plt.xlabel('Epoch')
    plt.ylabel('Accuracy')
    plt.legend()
    plt.grid(True)

    # Loss
    plt.subplot(1, 2, 2)
    plt.plot(history.history['loss'], label='Train Loss')
    plt.plot(history.history['val_loss'], label='Val Loss')
    plt.title(f'{model_name} - Loss')
    plt.xlabel('Epoch')
    plt.ylabel('Loss')
    plt.legend()
    plt.grid(True)

    plt.tight_layout()
    plt.show()

    # Print last epoch stats
    print(f"\n📊 Final Training Accuracy: {history.history['accuracy'][-1]:.4f}")
    print(f"📊 Final Validation Accuracy: {history.history['val_accuracy'][-1]:.4f}")

    # Evaluate on test set
    test_loss, test_acc = model.evaluate(test_generator, verbose=0)
    print(f"\n🧪 Test Accuracy: {test_acc:.4f}")
    print(f"🧪 Test Loss    : {test_loss:.4f}")
In [1628]:
plot_training_history_generator(basic_cnn_model_3_history, basic_cnn_model_3, test_generator=test_generator, model_name="Basic CNN 3")
No description has been provided for this image
📊 Final Training Accuracy: 0.2171
📊 Final Validation Accuracy: 0.2549

🧪 Test Accuracy: 0.2308
🧪 Test Loss    : 2.1786
In [1627]:
test_loss, test_acc = basic_cnn_model_3.evaluate(test_generator)
train_loss, train_acc = basic_cnn_model_3.evaluate(train_generator)
print(f"Test Accuracy: {test_acc*100:.2f}%")
print(f"Train Accuracy: {train_acc*100:.2f}%")
2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 29ms/step - accuracy: 0.2163 - loss: 2.1965
13/13 ━━━━━━━━━━━━━━━━━━━━ 1s 75ms/step - accuracy: 0.2576 - loss: 2.2257
Test Accuracy: 23.08%
Train Accuracy: 26.83%
In [1630]:
from sklearn.metrics import classification_report, confusion_matrix
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

def evaluate_model_predictions(model, test_generator, class_names):
    # Step 1: Predict
    predictions = model.predict(test_generator)
    predicted_classes = np.argmax(predictions, axis=1)
    true_classes = test_generator.classes

    # Step 2: Classification Report
    print("Classification Report:")
    print(classification_report(true_classes, predicted_classes, target_names=class_names, zero_division=0))

    # Step 3: Confusion Matrix
    cm = confusion_matrix(true_classes, predicted_classes)
    plt.figure(figsize=(12, 8))
    sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', xticklabels=class_names, yticklabels=class_names)
    plt.xlabel("Predicted")
    plt.ylabel("True")
    plt.title("Confusion Matrix")
    plt.show()
In [1631]:
evaluate_model_predictions(basic_cnn_model_3, test_generator, class_names)
2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 30ms/step
Classification Report:
              precision    recall  f1-score   support

   Apple Pie       0.33      0.20      0.25         5
   Chocolate       0.25      0.80      0.38         5
French Fries       1.00      0.20      0.33         5
      Hotdog       0.00      0.00      0.00         5
      Nachos       0.00      0.00      0.00         5
       Pizza       0.31      0.80      0.44         5
 onion_rings       0.00      0.00      0.00         5
    pancakes       0.00      0.00      0.00         6
spring_rolls       0.00      0.00      0.00         6
       tacos       0.22      0.40      0.29         5

    accuracy                           0.23        52
   macro avg       0.21      0.24      0.17        52
weighted avg       0.20      0.23      0.16        52

No description has been provided for this image
In [1632]:
import random
import matplotlib.pyplot as plt
import numpy as np

def plot_random_predictions_generator(test_generator, class_names, model, num_samples=5):
    """
    Plots random samples from one batch of the test_generator with predicted and actual labels.
    Correct predictions are shown in green, incorrect in red.

    Args:
        test_generator (DirectoryIterator or DataFrameIterator): Keras test data generator
        class_names (list): List of class names corresponding to label indices
        model (keras.Model): Trained classification model
        num_samples (int): Number of random samples to display (default: 5)
    """
    # Get one batch of data
    images, labels = next(test_generator)
    
    # Limit num_samples to batch size
    num_samples = min(num_samples, images.shape[0])
    indices = random.sample(range(images.shape[0]), num_samples)

    cols = 5
    rows = (num_samples + cols - 1) // cols

    plt.figure(figsize=(cols * 3, rows * 3))

    for i, idx in enumerate(indices):
        img = images[idx]
        true_label = np.argmax(labels[idx])

        pred_probs = model.predict(np.expand_dims(img, axis=0), verbose=0)
        pred_label = np.argmax(pred_probs)

        color = 'green' if pred_label == true_label else 'red'
        title_text = f"Pred: {class_names[pred_label]}\nActual: {class_names[true_label]}"

        plt.subplot(rows, cols, i + 1)
        plt.imshow(img)
        plt.title(title_text, color=color, fontsize=10)
        plt.axis('off')

    plt.suptitle("Model Predictions on Test Generator Batch", fontsize=16)
    plt.tight_layout()
    plt.subplots_adjust(top=0.85)
    plt.show()
In [1634]:
plot_random_predictions_generator(test_generator, class_names, basic_cnn_model_3, num_samples=20)
No description has been provided for this image

CNN Model Comparison Report¶


Overview¶

Aspect Model 1 Model 2 Model 3
Architecture Basic CNN (3 conv layers) Deep CNN with 5 conv blocks Deep CNN + Data Augmentation
Overfitting Yes – severe No – well-regularized No – regularized and generalized better
Regularization Dropout only Dropout + BatchNorm Dropout + BatchNorm + Data Augmentation
Learning Curve Early overfit Steady slow learning Gradual, steady learning with improved accuracy

Detailed Observations¶

1. Underfitting in Model 1 & 2, Better Learning in Model 3¶

  • Models 1 and 2 showed low training/validation accuracy, indicating underfitting.
  • Model 3 showed consistent improvement (e.g., ~22.5% validation accuracy by epoch 6).
  • Data augmentation helped learn more generalized features.

2. Low Precision, Recall, and F1-Scores¶

  • Most classes had low recall and F1-scores, especially in Models 1 & 2.
  • Some improvement in Model 3 for certain distinctive classes (e.g., spring_rolls, chocolate).

3. Improvements from Augmentation¶

  • Model 3 used ImageDataGenerator to apply:
    • Rotation, zoom, flips, brightness adjustments, etc.
  • Helped mitigate overfitting and improved learning.

4. Learning Stability¶

  • Model 3 showed gradual and stable learning (no sharp spikes).
  • Training and validation loss decreased in sync.

Model Architecture Feedback¶

Common Weaknesses (Models 1 & 2):¶

  • Shallow CNNs with limited filters.
  • Use of Flatten() increased overfitting risk.
  • Lack of complex feature extractors or residual connections.

Model 3 Improvements:¶

  • Introduced BatchNormalization, Dropout, and Augmentation.
  • More stable validation metrics, though still modest performance.

Recommendations¶

Model Enhancements:¶

  • Add more Conv2D blocks with higher filter sizes.
  • Use BatchNormalization and Dropout after every block.
  • Replace Flatten() with GlobalAveragePooling2D.

Data Handling:¶

  • Keep aggressive ImageDataGenerator usage.
  • Use class weights or oversampling to address class imbalance.

Training Strategy:¶

  • Train for 50–100 epochs.
  • Include:
    • EarlyStopping
    • ReduceLROnPlateau
    • Possibly learning rate warm-up or scheduling.

Upgrade to Transfer Learning:¶

  • Use a pretrained model like:
    • MobileNetV2, EfficientNetB0, or ResNet50.
  • Freeze base layers and fine-tune last few layers.
  • Ideal for small datasets and many classes.

Conclusion¶

  • Model 3 showed clear improvements over Models 1 & 2.
  • Still limited by:
    • Modest architecture
    • Small, imbalanced dataset

Next Steps:

  • Shift to transfer learning (Use a pretrained model like: MobileNetV2, EfficientNetB0, or ResNet50.
  • Improve data diversity
  • Scale training efforts

Transfer learning + augmentation is the most effective path forward.